Wednesday, May 17, 2006

Data mining for flu

Data mining is in the news these days, with stories that the Bush administration has been using automated computer techniques to sift through the phone records of tens of millions of US citizens looking for -- who knows what. One imagines the secret government eavesdropping agency, the NSA, using the latest in data mining technology, but more likely they are using some older technique, because that's the kind of incompetence we've come to expect from this administration. Fortunately, in this case.

But the use of algorithms to find patterns in large piles of data is an important area of applied and theoretical computer science and is being used in many areas. The Department of Homeland Security just announced it would let contracts this summer to move forward on the National Biosurveillance Integration System.
The biosurveillance system will aggregate and integrate information from food, agricultural, public health and environmental monitoring and the intelligence community from federal and state agencies and private sources to provide an early warning system for an outbreak or possible bioterrorism attack.

“By integrating and fusing this large amount of available information, we can begin to develop a baseline or background against which we can recognize anomalies and changes of significance indicating potential biological events,” he told the House Homeland Security Committee’s prevention of nuclear and biological attack subcommittee yesterday.

DHS will combine the biosurveillance patterns and trends with threat information and include the completed product in its Common Operating Picture, which DHS distributes through the Homeland Security Information Network. The biosurveillance system will also send back to its system partner agencies completed situational awareness in real-time streams. (Government Computer Network)
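For readers wondering what "a baseline against which we can recognize anomalies" might look like in practice, here is a minimal sketch. It is not how NBIS or BioSense works (neither article describes the method). It assumes the simplest possible approach: compare each day's count with the average of the previous week and flag days that sit far above it. The function name, the seven-day window, the threshold of three standard deviations and the visit counts are all made up for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(counts, window=7, threshold=3.0):
    """Return the indices of days whose count is far above the trailing baseline.

    Hypothetical illustration only: the baseline for day i is the mean of the
    previous `window` days, and a day is flagged when it exceeds that mean by
    more than `threshold` sample standard deviations.
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # Perfectly flat history: any rise above it counts as a change.
            if counts[i] > baseline:
                flagged.append(i)
            continue
        if (counts[i] - baseline) / spread > threshold:
            flagged.append(i)
    return flagged


# Made-up daily counts of flu-like visits at one hypothetical emergency room.
daily_visits = [12, 15, 11, 14, 13, 12, 16, 14, 13, 41, 15]
print(flag_anomalies(daily_visits))  # prints [9]
```

Even this toy version shows the catch: the method is only as good as the baseline, and a real system fusing food, agricultural, public health and environmental feeds would need a stable baseline for every one of them before an anomaly meant anything.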
The system, designed to detect bioterrorism events (for which it will be totally useless), is also being touted as a possible early warning system for a bird flu outbreak in the US (for which it might work, but would be superfluous given that outbreaks elsewhere would already have sounded the alarm). Whether or not it would work once up and running, critics are saying the timelines announced in the flu plan are grossly unrealistic:
Dr. Rex Archer, health director of Kansas City, Mo., and president of the National Association of County and City Health Officials, said NBIS would need a level of funding and national commitment comparable to the 1960s’ space race to deliver what the plan calls for within a year. (Government HealthIT)
The flu plan also envisions an expansion of CDC's BioSense project, an ambitious system for streaming real-time data from hospital emergency rooms to CDC. CDC intended for BioSense to gather data from hospitals in 31 cities by the end of 2006. The pandemic plan calls for CDC to collect data from 350 hospitals in 42 cities within 12 months. Many are skeptical BioSense will work at all, and the forced expansion of a system already stressed, underfunded and reeling under budget cuts is not just premature but almost certain to fail.

Bottom line: the surveillance component of the flu plan sounds good, but it's bogus. Like a lot of what this administration does.