A web-crawler is an application that extracts data from a web page and converts it into a “more usable” format. Crawlers are typically built for a specific web-site and purpose. Properly done, a crawler will emulate a user’s behaviour whilst shielding the scraper’s true identity through proxy-servers. Crawlers are also known as data collectors, data extractors, web scrapers and web-site rippers.
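As a rough illustration of "extracting data and converting it into a more usable format", the sketch below pulls link text and URLs out of a small HTML fragment and returns them as rows. It uses only Python's standard library. The `LinkExtractor` class and the sample HTML are hypothetical, chosen for illustration; a production crawler would be tailored to the target site.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects every <a href="..."> as a {"text", "url"} row (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._href = None   # href of the link currently being read, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.rows.append({"text": "".join(self._text).strip(), "url": self._href})
            self._href = None


# Hypothetical page fragment standing in for a downloaded web page.
sample_html = (
    '<ul><li><a href="/p/1">Blue Widget</a></li>'
    '<li><a href="/p/2"> Red Widget </a></li></ul>'
)

parser = LinkExtractor()
parser.feed(sample_html)
rows = parser.rows
```

Each row is a plain dictionary, so the result can be written straight to CSV or JSON.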
It depends on numerous factors, including:
We offer both types of billing structures:
Multi-crawler projects or long-term ongoing projects obtain volume discounts.
In certain situations data scraping is considered unethical or even illegal. Much depends on how it is done, the type of data extracted and for what purpose the extracted data will be used. Each data scraping project thus needs to be assessed on its own merits. If in doubt, please obtain your own legal advice.
We have been doing this since 2005 and have not heard of any legal problems arising from our clients’ use of the scraped data.
We value client confidentiality and discretion. We also scrape websites in a highly anonymous manner that is impossible to trace back to us (and you). We can covertly get the data for you, but what happens thereafter depends on what you do with it.
Yes, we are a data-focussed company that provides a full suite of data analysis, data cleansing, data mining and data warehousing services.
Yes, you can; however, you may have to install additional software and obtain access to proxy-servers.
Please contact us with an overview of your project requirements. We will set up an introductory call with you to discuss your project and better understand your requirements. We will review the sites with your intentions in mind and then provide an initial assessment of expected costs. We offer you a free proof of concept for 1-2 websites with one-time data extraction. Once you are satisfied with the proof of concept results, we will generate a final quote, which takes approx. 3-4 days. Upon final agreement on the quote, we sign a mutual non-disclosure agreement, if required. Once we receive your signed Letter of Engagement, we commence the development of the crawler(s).
Once completed, crawlers are tested and reviewed. We also provide you with an initial data extract for your review and approval.
Once approved, we transfer the crawler into production mode, sharing data in the agreed format and getting feedback at pre-agreed intervals. We also take care of any post-production data processing, including parsing, standardisation, normalisation and de-duplication. Once deliverables are approved, we submit an invoice, which is due for payment within 15 days.
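To give a feel for what normalisation and de-duplication can involve, here is a minimal sketch using Python's standard library. The function names, the sample records and the choice of `name` and `price` as the duplicate key are all assumptions made for illustration; the actual processing depends on the data and the agreed format.

```python
import unicodedata


def normalise_record(record):
    """Lower-case the keys, apply Unicode NFKC, and collapse runs of whitespace."""
    cleaned = {}
    for key, value in record.items():
        text = unicodedata.normalize("NFKC", str(value))
        cleaned[key.strip().lower()] = " ".join(text.split())
    return cleaned


def deduplicate(records, key_fields):
    """Keep the first record for each case-insensitive combination of key_fields."""
    seen = set()
    unique = []
    for record in records:
        fingerprint = tuple(record.get(field, "").casefold() for field in key_fields)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


# Hypothetical raw extract: the first two rows describe the same item.
raw = [
    {"Name": "  Acme  Widget ", "Price": "9.99"},
    {"name": "ACME widget", "price": "9.99"},
    {"name": "Other Item", "price": "4.50"},
]

clean = deduplicate([normalise_record(r) for r in raw], key_fields=("name", "price"))
```

Normalising before comparing is what lets the first two rows be recognised as duplicates despite their different spacing and capitalisation.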
Payment is by direct deposit into our bank account. We do accept PayPal; however, we add the (somewhat expensive) PayPal fees to your invoice. Please note that we do not have credit card facilities.
The invoice for each month is raised on the 15th day of that month. Payments are expected within 15 days from the date of invoice.
If payment has not been received within 30 days, a fee of 2% of the value of the invoice will be charged. This fee will be charged every 30 days until the invoice is paid in full.
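The late fee can be worked through with a short calculation. The sketch below rests on assumptions not stated above: days are counted from the invoice date, the fee is a flat 2% of the original invoice value (not compounded), and a new fee falls due on day 31, day 61, and so on. The function name `late_fee` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

LATE_FEE_RATE = Decimal("0.02")   # 2% of the invoice value per period
FEE_PERIOD_DAYS = 30


def late_fee(invoice_value, days_since_invoice):
    """Total late fees accrued, assuming a non-compounding 2% per 30 days overdue."""
    if days_since_invoice <= FEE_PERIOD_DAYS:
        return Decimal("0.00")
    # Assumption: a fee is charged once each full 30-day period has passed unpaid.
    periods = (days_since_invoice - 1) // FEE_PERIOD_DAYS
    fee = Decimal(str(invoice_value)) * LATE_FEE_RATE * periods
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under these assumptions, an invoice of 1,000 paid on day 30 attracts no fee, one paid on day 31 attracts 20.00, and one paid on day 61 attracts 40.00.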
Yes, we can provide you with the crawler source code at no extra charge; however, you may need to install additional software and/or obtain access to proxy-servers before it can be used on your system.
Data is provided in the agreed format either by email, via a shared Dropbox folder or uploaded directly to your Amazon S3 account. This includes any extracted images, PDFs and documents where required.
We provide a fully managed solution where we take care of the entire scraping process so that you can receive freshly harvested data, without the fuss and hassle.