
White House Calls for AI Mining of New COVID-19 Database

The White House is calling on the AI community to tap into a new machine-readable database to help answer questions about the coronavirus posed by the World Health Organization and the National Academies Standing Committee on Infectious Diseases.

The call came Monday (March 16) in a briefing by the White House Office of Science and Technology Policy (OSTP) on the new machine-readable COVID-19 dataset. The dataset is meant to unite the AI and medical communities in trying to answer those questions by combining data from some 29,000 articles (13,000 of them with full text, which are the most useful) with the summarizing and analyzing powers of AI, which in this case means "augmented intelligence."

Partnering on the effort to create the dataset and challenge the AI community, and anyone else for that matter, to answer those pressing questions were the Allen Institute for AI, the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, Microsoft, the National Library of Medicine at the National Institutes of Health, and Kaggle.

Dr. Oren Etzioni of the Allen Institute for AI, one of the partners, said that they are looking to get full text (data) for the rest of those articles. 

Etzioni said that high tech had gotten a bad rap of late, but the effort shows that it can do a world of good by helping scientists research the disease "quickly and effectively."

He said that while AI is criticized in relation to facial recognition and deepfakes, it is on the front lines of fighting the virus through this initiative. He also pointed out that the dataset is open to anyone.

The questions include what is known about transmission and incubation of the disease, and the answers can help those working on vaccines or on guidance about social interactions.

The White House last week asked science and tech publishers to allow any relevant material on the COVID-19 coronavirus to be compiled into a free, open and accessible database whose text can be mined and collated. The White House said those publishers had responded to the call for open access, and continue to do so, and that the dataset is now ready for data mining.

U.S. CTO Michael Kratsios said that it is tough to get machine-readable rights to all the info, so the point was to get an updatable, one-stop shop for info on the virus. He said he expected it would have a catalytic effect on how research is done going forward.

The partners in the effort are looking to the press to get the word out about the dataset's availability and get more eyeballs on the info.