Newstex is a provider of digital content that is curated for analysts, researchers, public relations professionals, and content marketers.
We serve information product managers who need to provide these audiences with comprehensive coverage of emerging and fast-moving trends.
We have secured license and distribution agreements with over 5,000 digital publications and we deliver over 1 million original articles per month.
The majority of these publications specialize in highly topical aspects of business, education, finance, health, journalism, law, lifestyle, politics, science, and technology.
Receive content from thousands of sources in a single feed.
Enjoy a portfolio of fully licensed content that is always growing.
Obtain the full text of each article in standard XML format.
Newstex content covers emerging trends in business, education, finance, health, law, politics, lifestyle, science, and technology.
You will not have to manage multiple feeds. Newstex makes it easy.
Your content needs will be met through a single feed that includes all summary, author, publication, and other important metadata.
All Newstex content creators sign full-text distribution and archiving agreements.
We offer full-text articles from numerous popular websites.
You will not have to worry about copyright issues.
When you receive content from us, you can be sure that the copyright permissions are all in order.
Newstex polls each feed roughly every 6 hours. We can configure a given feed to be polled more or less often, but we cannot set specific times at which a feed is polled.
If you have specific publications you need to make sure are processed more quickly, please let us know.
Files may be delivered to you via one of three methods.
Newstex can send content to you via an HTTPS POST or PUT operation. For HTTPS push, you specify a base URL, and we issue a POST with the file contents in the request body and the file name in the URL. Note: we will retry any request that returns a status code other than 2xx.
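On the receiving side, an endpoint for HTTPS push could look roughly like the sketch below. It is only an illustration under several assumptions: the `newstex_inbox` directory, the `store_delivery` helper, and the choice of a 204 response are all hypothetical, and TLS is assumed to be terminated by a reverse proxy in front of this plain HTTP server. The one behavior taken from the description above is that the file name comes from the URL, the file contents come from the request body, and a 2xx status tells the sender not to retry.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path
from urllib.parse import unquote, urlparse


def store_delivery(url_path: str, body: bytes, inbox: Path) -> int:
    """Save one pushed file and return the HTTP status to send back."""
    # Keep only the last path segment so a crafted URL cannot escape the inbox.
    name = Path(unquote(urlparse(url_path).path)).name
    if name in ("", ".."):
        return 400  # non-2xx: the sender is expected to retry
    inbox.mkdir(parents=True, exist_ok=True)
    (inbox / name).write_bytes(body)
    return 204  # any 2xx status acknowledges the delivery


class DeliveryHandler(BaseHTTPRequestHandler):
    # Hypothetical local directory for delivered files.
    inbox = Path("newstex_inbox")

    def _receive(self):
        length = int(self.headers.get("Content-Length", 0))
        status = store_delivery(self.path, self.rfile.read(length), self.inbox)
        self.send_response(status)
        self.end_headers()

    # The delivery may arrive as either POST or PUT, so both use the same logic.
    do_POST = _receive
    do_PUT = _receive


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the receiver until interrupted (host and port are placeholders)."""
    ThreadingHTTPServer((host, port), DeliveryHandler).serve_forever()
```

Calling `serve()` starts the receiver. Because a failed request is retried, the same file may arrive more than once; writing by file name, as here, makes a repeated delivery overwrite the earlier copy rather than duplicate it.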
Newstex can send content to you via a standard FTP or SFTP connection. In these cases, Newstex will deliver the file to an FTP server you host and then verify the file size. Newstex supports both Active and Passive FTP connections. We also support an optional base directory into which files are uploaded.
Please make sure to specify:
3. Host name
4. Base directory
5. Active or passive connection
Important Note: Since Newstex processes thousands of files per day, we may issue multiple connections simultaneously to the same server. If you have connection concerns, please make sure to let us know during signup.
Amazon S3 Delivery
Newstex can send content to you via an Amazon S3 Bucket.
In order to deliver content to you via S3, please provide:
1. AWS region
2. S3 bucket name
3. Your S3 user email or canonical ID
When setting up an S3 delivery, you will need to grant our account access to your S3 bucket. Please reach out to us for more details on this setup.
This section covers the different story formats we support. These are not to be confused with transmission types, such as SFTP or HTTPS POST.
Story XML is a simplified version of XML that contains a <story> element with basic XML tags for each metadata field:
3. HTML content
5. Date, Publication ID
6. Publication name
For a full list of metadata and sample XML files, please contact us.
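As a rough illustration of reading a Story XML file, the sketch below collects each child of the `<story>` element into a dictionary. The tag names (`date`, `publication_id`, `publication_name`, `content`), the CDATA wrapping of the HTML body, and the sample values are all invented for this example; the real field list and samples are available on request, as noted above.

```python
import xml.etree.ElementTree as ET

# Invented sample: the tag names and values are placeholders, not the real schema.
SAMPLE_STORY = """<story>
  <date>2024-01-15T09:30:00Z</date>
  <publication_id>12345</publication_id>
  <publication_name>Example Blog</publication_name>
  <content><![CDATA[<p>Body with a loose <br> tag</p>]]></content>
</story>"""


def read_story(xml_text: str) -> dict:
    """Return one entry per metadata tag found under <story>."""
    story = ET.fromstring(xml_text)
    # The HTML body is kept as an opaque string; it is not parsed here.
    return {child.tag: (child.text or "") for child in story}


fields = read_story(SAMPLE_STORY)
print(fields["publication_name"])
```

Reading fields generically by tag name, rather than hard-coding each one, means the sketch does not depend on the placeholder schema being correct.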
For distributors who want to process data in a more modern and transmissible format, we can also deliver our content via JSON. Data fields are the same as those available in XML; however, JSON is typically easier to parse in modern languages.
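A JSON delivery can be loaded with a standard JSON parser. The sketch below assumes one story per JSON object and uses hypothetical key names (`date`, `publication_id`, `publication_name`, `content`); the actual keys should be taken from the metadata list Newstex provides.

```python
import json

# Hypothetical keys this consumer expects; adjust to the real metadata list.
EXPECTED_KEYS = ("date", "publication_id", "publication_name", "content")

# Invented sample payload for illustration only.
SAMPLE_JSON = json.dumps({
    "date": "2024-01-15T09:30:00Z",
    "publication_id": "12345",
    "publication_name": "Example Blog",
    "content": "<p>Body as HTML</p>",
})


def load_story(raw: str) -> dict:
    """Parse one story and fail loudly if an expected key is absent."""
    story = json.loads(raw)
    missing = [key for key in EXPECTED_KEYS if key not in story]
    if missing:
        raise ValueError(f"story is missing fields: {missing}")
    return story


story = load_story(SAMPLE_JSON)
print(story["publication_name"])
```

Checking for expected keys at load time surfaces a schema mismatch immediately instead of later in the pipeline.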
All formats provide the content in HTML. Please make sure you can process all HTML elements. We recommend not attempting to parse the story body, as these stories come from thousands of different publishers, each with its own writing and formatting style. Do not attempt to parse HTML as XML.
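To illustrate why HTML should not be parsed as XML, the sketch below feeds an invented story body to a strict XML parser and to Python's tolerant `html.parser`. The sample body and the helper names are assumptions made for this example. The unclosed `<br>` tag and the `&nbsp;` entity are valid HTML but cause the XML parser to reject the fragment.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Invented story body: valid HTML, but not well-formed XML.
BODY = "<p>Rates rose 2%<br>Analysts &amp; traders reacted&nbsp;quickly</p>"


def parses_as_xml(fragment: str) -> bool:
    """Report whether a strict XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Collects text nodes and ignores all tags."""

    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(fragment: str) -> str:
    """Extract whitespace-normalized text, suitable for indexing only."""
    collector = TextCollector()
    collector.feed(fragment)
    collector.close()
    # Joining with spaces keeps words on either side of a tag apart.
    return " ".join(" ".join(collector.parts).split())


print(parses_as_xml(BODY))
print(html_to_text(BODY))
```

Joining text nodes with spaces can split a word that is interrupted by an inline tag, which is usually acceptable for indexing but is another reason not to use extracted text for display.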
Plain text variations are also available; however, they strip all formatting, so they should be used only for indexing content, not for displaying it.