This is a video illustrating historical oil production in the Bakken field in North Dakota. We put it together at Drilling Analytics to show what our data and analytics were capable of. It's a great example of using Machine Intelligence to gather, analyse and process data in a meaningful way.
Although this data is publicly available, it's extremely hard to access, process and collate in this way. The data is contained in old "End of Well Reports", which are written by Well Engineers at the completion of an oil well, and submitted to the North Dakota Industrial Commission. A typical report is around 200 pages long, poorly structured, and stored as scanned images in a PDF. Oil production data is available elsewhere, but it needs tying back to source.
This video contains information on around 50,000 oil wells, so we created a highly automated system to extract and process this data:
Found each well reported to the North Dakota Industrial Commission
Accessed the End of Well Report PDF for each well
Used image recognition to process the PDF images into text, then word analysis to clean up typos
Searched the text for keywords and used context analysis to determine name, location, data and other parameters
Used this information to determine the Lat-Long coordinates of the wellhead
Tied the well name to oil production reports, and pulled these into our database
Calculated monthly oil production from these reports
Used matplotlib to build monthly bubble chart images of oil production
Compiled the bubble chart images into an animation
Posted on YouTube
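To give a flavour of the middle steps, here is a minimal sketch of typo clean-up, keyword extraction and monthly aggregation, using only the Python standard library. It is not our production code: the keyword list, the report layout ("WELL NAME:", "LATITUDE:", "LONGITUDE:"), the sample well and the function names are all assumptions made up for illustration, and real reports are far messier and often need the coordinates to be derived rather than read directly.

```python
import re
from collections import defaultdict
from difflib import get_close_matches

# Hypothetical field labels we expect in a report (an assumption for this sketch).
KEYWORDS = ["WELL", "NAME", "LATITUDE", "LONGITUDE", "COUNTY"]


def fix_typos(text, vocabulary=KEYWORDS, cutoff=0.8):
    """Replace OCR-mangled words with the closest known keyword, if one is close enough."""
    def repair(match):
        word = match.group(0)
        close = get_close_matches(word.upper(), vocabulary, n=1, cutoff=cutoff)
        return close[0] if close else word

    # Only consider tokens of 4+ letters/digits; short tokens are left untouched.
    return re.sub(r"[A-Za-z0-9]{4,}", repair, text)


def extract_fields(text):
    """Pull name and coordinates out of cleaned text, using the assumed labels."""
    name = re.search(r"WELL NAME\s*:\s*(.+)", text)
    lat = re.search(r"LATITUDE\s*:\s*(-?\d+(?:\.\d+)?)", text)
    lon = re.search(r"LONGITUDE\s*:\s*(-?\d+(?:\.\d+)?)", text)
    return {
        "name": name.group(1).strip() if name else None,
        "lat": float(lat.group(1)) if lat else None,
        "lon": float(lon.group(1)) if lon else None,
    }


def monthly_production(records):
    """Sum barrels per (well name, 'YYYY-MM') from (name, ISO date, barrels) rows."""
    totals = defaultdict(float)
    for name, date, barrels in records:
        totals[(name, date[:7])] += barrels
    return dict(totals)


# Made-up OCR output with two typical character confusions (1 for I, l for I).
ocr_text = "Well Name: Hanson 1-12H\nLat1tude: 48.1234\nLongltude: -103.5678"
well = extract_fields(fix_typos(ocr_text))

# Made-up production rows for the same hypothetical well.
rows = [
    (well["name"], "2012-03-05", 100.0),
    (well["name"], "2012-03-20", 50.0),
    (well["name"], "2012-04-01", 75.0),
]
production = monthly_production(rows)
```

Fuzzy matching against a small vocabulary is only one way to repair OCR output. It works best on long labels, and the `cutoff` value trades missed typos against false corrections. The monthly totals keyed by well and month are the kind of table a bubble chart per month could then be drawn from.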
More than anything, this shows the ability to tie together multiple image recognition, NLP, processing and machine intelligence algorithms to automate a task that would take a human years of tedious work to complete.