Dear all,
It has been a little while since we last updated you on the project, and now that 2014 has come to an end, it seems a good opportunity to relay some of the key events of the year and our plans for the future.
As of the end of December 2014, 746 people have signed up; we’ve identified more than 4,000 RCTs/q-RCTs and screened 104,000 records!
Those are metrics I think we can all feel very proud of.
This is the first formal project within Cochrane to use crowdsourcing for a specific task necessary to the maintenance of a Cochrane product or service. As such, we have been monitoring three outcome measures carefully throughout the course of the year: accuracy, speed, and engagement.
ACCURACY
By this we mean: are records ending up in the right place, and how does the crowd’s accuracy compare with the methods used previously? We’ve conducted two validation exercises to try to answer these questions. The exercises were run independently of each other, and each involved ‘expert’ screeners screening several thousand citations so that their decisions could be compared with the crowd’s decisions on those same records. In the first exercise the expert was aware of the crowd’s decision; in the second, the experts were blind to it. In both exercises, crowd sensitivity (the ability of the crowd to correctly classify the citations that should be in CENTRAL) and crowd specificity (the ability to correctly classify records which should not be in CENTRAL) each came out at over 99%. This is about as good as it gets.
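For anyone who likes to see the numbers behind the words, the two measures follow their standard definitions (a quick sketch in the usual notation, where TP, FN, TN and FP are the counts of true positives, false negatives, true negatives and false positives from a validation sample):

\text{sensitivity} = \frac{TP}{TP + FN} \qquad \text{specificity} = \frac{TN}{TN + FP}

Here a ‘positive’ is a record that belongs in CENTRAL, so sensitivity reflects how rarely the crowd misses a genuine RCT/q-RCT and specificity reflects how rarely a non-trial slips through.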
SPEED/TIME
It takes on average 35 seconds to screen a record. Reject records are significantly quicker to screen than non-Reject records, which is good because there are lots more of them! On the whole we have kept up well with our hoped-for schedule, given that we always knew we would need some run-in time (we went live in late February with only about ten screeners, so this year was always going to be largely about building capacity and increasing our numbers).
We’ve done some work to try to understand the effect of the highlighted words and phrases on screener performance, especially with regard to speed. Unsurprisingly, we found that having key words and phrases highlighted significantly reduces the time it takes a screener to classify a record. Quantifying this showed just what a difference the feature has made: after 6 months of screening, approximately 1,000 hours of screening time had been saved by the highlights (mainly the red highlights), because it takes on average almost twice as long to screen a record when the highlight function is switched off.
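To put that per-record difference in perspective, here is an illustrative back-of-the-envelope calculation (assuming, for simplicity, that highlights save roughly 35 seconds on every record, in line with the ‘almost twice as long’ finding above):

104{,}000 \text{ records} \times 35 \text{ s} \approx 3{,}640{,}000 \text{ s} \approx 1{,}011 \text{ hours}

so across everything screened to date the saving is on the order of a thousand hours.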
ENGAGEMENT
Having over 700 people sign up to do this is our first positive indicator that this is something people either want to do or don’t mind doing because it is for a good cause. Many of those who have signed up have gone on to screen hundreds of records. That said, we know there is much more we could do to make this a more satisfying, even fun, experience. The improvements we have in mind include: better feedback mechanisms; better opportunities for screeners to engage with each other should they wish to do so (screening citations can be a lonely business!); better incentives to take part and, we hope, return again and again; and technical changes to the tool that would improve responsiveness as well as enable offline screening.
48hr SCREENING CHALLENGE
This was definitely one of the highlights of the year for me. At the end of October/beginning of November we held our first citation screening challenge. Together, 75 of us managed to screen over 20,000 citations in a 48hr period and raise more than £5000 for the Ebola relief effort. Here’s the link to the Cochrane news piece about it: http://www.cochrane.org/news/news-events/current-news/how-many-randomised-trials-can-we-find-48-hours-0
Many have asked when the next challenge will be; we are working on a plan for that and will keep you posted.
WHAT DOES 2015 HOLD IN STORE?
We plan to do a number of things next year:
- Get version 2 of the tool out. The tool we have now has been fantastic but we’ve always known that there are features and functions screeners would really like to see.
- Explore the role of machine learning and text mining. We have a year’s worth of entirely human-generated data. We now need to see what role the machine could play in helping to screen and classify citations. We are in the process of putting together a Cochrane discretionary fund bid on this.
- Expand on the training provided and improve the feedback mechanisms. This ranges from providing more practice records for screeners to delivering tailored feedback on records where the classifications have not been unanimous.
- Run more sub-projects involving anyone who wants to take part - we have so much to test and a lot of generated data to analyse. Many of you have expressed an interest in helping us, which is fantastic, and we will set up a programme of work in this area.
- Run more screening challenges!
I’ve tried to keep this update brief, believe it or not, and report only the headline metrics. For some that will be enough, but I know many of you are interested in hearing more about the project and the data we have generated so far. Rather than trying to cram all that information into one update email, we are going to set up a project blog, and we’ll email the details as soon as we have done so.
It’s been an amazing year for this project; well done and thank you all very much.
With best wishes,
Anna, Gordon and Julie
Happy screening!