Text extraction with mindUp web content crawler/spider
Intelligent, adaptive web crawler system for automatically scanning websites and pages.
Fully automatic extraction of structured knowledge from unstructured data.
Web crawler features:
- Scales to any project size
- Processes millions of web pages per day
- Extraction tasks can be configured freely (extraction agent)
- Adaptive scanning (domain scanning)
- Bot compliance (respects "robots.txt")
- Web farming
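To illustrate what respecting "robots.txt" means in practice, here is a minimal sketch using Python's standard `urllib.robotparser` module. It is not mindUp's implementation; the rules, the `ExampleBot` user agent, and the example.com URLs are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real crawler would download this
# file from the target site before requesting any other page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to request the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
allowed = may_fetch(parser, "ExampleBot", "https://example.com/products/1")
blocked = may_fetch(parser, "ExampleBot", "https://example.com/private/data")
delay = parser.crawl_delay("ExampleBot")  # seconds to wait between requests
```

A compliant crawler would skip every URL for which `may_fetch` returns False and wait at least `delay` seconds between requests to the same host.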
mindUp is an experienced specialist in content detection and data extraction. Extraction can be applied to product pages, documents, or previously collected data, for example to generate market data or price comparisons.
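As a rough idea of how a product record for a price comparison could be pulled out of an unstructured page, the following sketch uses only the Python standard library. It is an illustrative assumption, not mindUp's extraction agent: the sample HTML, the `extract_product` helper, and the rule "first text chunk is the title" are all hypothetical.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text chunks of an HTML document in order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Matches amounts such as "249.99 EUR" or "30 USD" (assumed price format).
PRICE_RE = re.compile(r"(\d+(?:\.\d{2})?)\s*(EUR|USD)")


def extract_product(html: str):
    """Return a structured record, or None if no price is found."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    text = " ".join(collector.chunks)
    match = PRICE_RE.search(text)
    if match is None:
        return None
    return {
        "title": collector.chunks[0],  # assumption: first text chunk is the title
        "price": float(match.group(1)),
        "currency": match.group(2),
    }


SAMPLE = (
    "<html><body><h1>Espresso Machine X200</h1>"
    "<p>Now only 249.99 EUR, free shipping.</p></body></html>"
)
record = extract_product(SAMPLE)
```

Records of this shape, collected across many shops, are the kind of structured data a price comparison or market overview could be built on.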
Crawler technology in combination with content extraction enables a wide variety of applications: