Crawler Questions

Are you a feed provider interested in working with Carsforsale.com? These are some of the questions we may be asking.

  • What are the outgoing IP ranges used by your traffic?
  • What are the user agents, if any, used by your traffic?
  • Are you able to leverage custom user agents?
  • Can you accept a token-based user agent?
  • Do you leverage cloud services to download our resources?
  • Do your crawlers respect robots.txt?
  • Do your crawlers have auto-throttle mechanisms built in?
  • Can you throttle your requests if thresholds are exceeded (429 Too Many Requests)?
  • Do your crawlers require the ability to scrape our properties?
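For the robots.txt question above, a minimal Python sketch of how a crawler might check a URL before requesting it, using the standard library's `urllib.robotparser`. The robots.txt content, the `ExampleFeedBot/1.0` user agent, and the example.com URLs are illustrative assumptions, not Carsforsale.com's actual policy.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

# Hypothetical crawler name; use your own identifiable user agent.
USER_AGENT = "ExampleFeedBot/1.0"


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(SAMPLE_ROBOTS_TXT)

# Ask before fetching: is this URL allowed for our user agent?
allowed = rules.can_fetch(USER_AGENT, "https://example.com/images/1.jpg")
blocked = rules.can_fetch(USER_AGENT, "https://example.com/private/report")

# Crawl-delay (in seconds) declared for our user agent, or None if absent.
delay = rules.crawl_delay(USER_AGENT)
```

In a real crawler the robots.txt text would be fetched from the target site and refreshed periodically rather than hard-coded.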

Crawling Best Practices

  1. “Be Nice” and follow our website's crawling policies.
  2. Treat our web servers with respect by implementing auto-throttling mechanisms.
  3. Headless browsers are acceptable; however, please ensure that every request carries an identifiable user agent.
  4. Has your crawler been blocked? Is it receiving CAPTCHA pages or odd or unsuccessful responses? Reach out to [email protected], and we will work with you to determine why your crawler is having issues.
  5. Be respectful of our servers, and provide consistent and identifiable request information:
    • Reach out to [email protected] with your outgoing IP ranges and user agents.
    • Be sure to address the nature of your crawler in the correspondence.
    • Doing so tends to raise the success rate of your requests, depending on how your traffic is classified.
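The auto-throttling practice above can be sketched as a small delay calculator. This is a minimal Python sketch under stated assumptions: the function name `next_delay`, the doubling and halving policy, and the 1-second and 300-second bounds are all hypothetical choices, not thresholds published by Carsforsale.com.

```python
def next_delay(status_code, current_delay, retry_after=None,
               base_delay=1.0, max_delay=300.0):
    """Return how many seconds to wait before the next request.

    All defaults here are illustrative assumptions.
    """
    if status_code in (429, 503):
        # Prefer the server's Retry-After hint when it is a number of seconds.
        if retry_after is not None:
            try:
                hinted = float(retry_after)
                return min(max(hinted, base_delay), max_delay)
            except ValueError:
                pass  # Retry-After may be an HTTP date; fall back to backoff.
        # Otherwise back off exponentially, capped at max_delay.
        return min(current_delay * 2, max_delay)
    if 200 <= status_code < 300:
        # Recover gradually after successes, never below base_delay.
        return max(current_delay / 2, base_delay)
    # Other responses: hold the current pace.
    return current_delay
```

A crawler loop would call `next_delay` after each response, passing the status code and the `Retry-After` header value if one was sent, and sleep for the returned number of seconds before the next request.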

About Our Crawlers

Carsforsale.com-ImageFeeds/1.0 (http://developers.carsforsale.com/crawlers)
Given a list of image URLs provided by an import, this crawler requests and downloads each resource.

Carsforsale.com-ImageDownloader/1.0 (http://developers.carsforsale.com/crawlers)
Given a list of image URLs provided by an import, this crawler requests and downloads a resource on demand.

Verifying Authenticity

Each crawler above identifies itself by the name shown in its user agent string and sends requests from an outbound IP address within these IP ranges:

  • 198.185.165.254/32
  • 198.185.165.125/32
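A feed provider could combine both checks on the receiving side. The following is a minimal Python sketch using the standard `ipaddress` module; the crawler names and IP ranges are taken from this page, while the function name `is_carsforsale_crawler` and the prefix-matching rule are assumptions made for illustration.

```python
import ipaddress

# User agent prefixes and IP ranges as listed on this page.
CRAWLER_NAMES = (
    "Carsforsale.com-ImageFeeds/",
    "Carsforsale.com-ImageDownloader/",
)
CRAWLER_NETWORKS = tuple(
    ipaddress.ip_network(cidr)
    for cidr in ("198.185.165.254/32", "198.185.165.125/32")
)


def is_carsforsale_crawler(user_agent: str, remote_ip: str) -> bool:
    """Return True only if both the user agent and the source IP match."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # Not a parseable IP address.
    name_ok = any(user_agent.startswith(name) for name in CRAWLER_NAMES)
    ip_ok = any(addr in net for net in CRAWLER_NETWORKS)
    return name_ok and ip_ok
```

Requiring both conditions matters because a user agent string alone is trivial to spoof; the source IP should be the address of the actual connection, not a client-supplied header.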

Supported Security Protocols

  • TLS 1.3 - Yes
  • TLS 1.2 - Yes
  • TLS 1.1 - Yes (supported until 2019-01-19)
  • TLS 1.0 - Yes (supported until 2019-01-19)
  • SSL 3 - No
  • SSL 2 - No

Ensure your crawler is updated to support the above protocols for optimal asset retrieval.
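As one way to act on this from the client side, a Python crawler could pin its minimum protocol version with the standard `ssl` module. This is a sketch, assuming a TLS 1.2 floor is appropriate given the end-of-support dates listed above; the helper name `make_tls_context` is hypothetical.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumed floor: TLS 1.2 (TLS 1.3 is negotiated when both sides support it).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=make_tls_context())`.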

Crawl Rates

Our pulls are designed to be gentle on resource providers, which helps ensure that resources are retrieved successfully. If a request returns a non-200 status code, our crawlers throttle down their requests where possible.

The above crawlers run 24 hours a day, with the majority of requests made between 12 AM and 6 AM CST.

Problems

If you notice issues with any of our crawlers, please contact [email protected] and state the issue in detail.