Different websites present information in different ways. Blending it into one unified & consistent
format that allows cross-comparison can be a challenge. Xtractly's robust architecture and staff take
care of the problem for you, leveraging our proprietary tools & know-how to deliver the relevant data
you require. You just tell us which sites you want to harvest and what data to extract, and we'll create the
structure to facilitate it, delivering the data in the format of your choice.
The degree to which a site's information is broken down into separate data fields is the degree to which
that information is "structured." Hence, a site like eBay, with its many categories, subcategories, and tabular
search results, would be considered highly structured, while a classifieds site such as Craigslist would have
mainly "unstructured data," its ads being composed of descriptive text stored under generalized
categories. Data gathered from these unstructured sources needs some work to impose structure upon it
before it can be stored and searched alongside data from structured sources. This can be complex and
expensive. But Xtractly has automated tools designed to keep the process affordable by enabling our staff
to extract & deliver your data consistently, with the greatest efficiency and at the lowest possible price.
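As a rough illustration of what "imposing structure" can mean, the sketch below pulls a few fields out of
a free-text classified ad. It is not Xtractly's actual tooling: the sample ad, the field names, the patterns,
and the structure_ad function are all assumptions made up for this example.

```python
import re

# Hypothetical ad text; a real ad would be far messier than this.
AD_TEXT = "2009 Honda Civic, 85,000 miles, runs great. $6,500 OBO. Call 555-0142."


def structure_ad(text):
    """Pull a few structured fields out of free-form ad text (illustrative only)."""
    # Each pattern is a simple assumption about how ads tend to be written.
    price = re.search(r"\$\s?(\d[\d,]*)", text)
    year = re.search(r"\b(?:19|20)\d{2}\b", text)
    mileage = re.search(r"(\d[\d,]*)\s*miles\b", text, re.IGNORECASE)
    return {
        "price": int(price.group(1).replace(",", "")) if price else None,
        "year": int(year.group(0)) if year else None,
        "mileage": int(mileage.group(1).replace(",", "")) if mileage else None,
        "description": text,  # keep the original text alongside the extracted fields
    }


record = structure_ad(AD_TEXT)
```

Fields that cannot be found are left as None rather than guessed, so incomplete ads can still be stored
and searched next to records from structured sources.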
The work required to present your data in the format you'd like to receive depends upon the sources
it comes from. Structuring will often involve merging fields, creating unified values, appending data, or drawing
inferences from contextual clues. Data may also need to be scrubbed against other lists, de-duped, or classified
according to a taxonomy. When any of these things is needed, Xtractly already has the tools in
place to handle them all, which translates to faster set-up and better pricing for you, our valued client. Evaluations
are free, and we're always happy to consult with you about your project.
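To make a few of the structuring steps named above more concrete (merging fields, creating unified values,
and de-duping), here is a minimal sketch. It is only an illustration under stated assumptions, not a
description of Xtractly's tools: the sample records, the field names, and the normalize and dedupe
functions are hypothetical.

```python
def normalize(record):
    """Map source-specific fields onto one shared schema (illustrative only)."""
    # Merge fields: some sources split the title into make/model, others supply it whole.
    title = record.get("title") or " ".join(
        part for part in (record.get("make"), record.get("model")) if part
    )
    # Unified values: express every price as a float, whatever form it arrived in.
    raw = record.get("price")
    raw_price = "" if raw is None else str(raw).replace("$", "").replace(",", "").strip()
    price = float(raw_price) if raw_price else None
    return {"title": title.strip().lower(), "price": price, "source": record.get("source")}


def dedupe(records):
    """Keep the first record seen for each (title, price) pair."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["title"], rec["price"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records from two sources with different layouts.
RAW = [
    {"source": "site_a", "make": "Honda", "model": "Civic", "price": "$6,500"},
    {"source": "site_b", "title": "honda civic", "price": 6500},
    {"source": "site_b", "title": "Ford Focus", "price": "4,200.00"},
]

clean = dedupe([normalize(r) for r in RAW])
```

Matching on the normalized title and price is a deliberately simple choice for the example; a real project
would pick its de-duplication key to suit the data at hand.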