For Ben Draves ’17 and Josh Arfin ’17, the Wesleyan University DataFest was another chance to hone their statistics skills in preparation for graduate school. When they showed up, however, their team of two looked statistically significantly smaller than the other teams.
“We didn’t really understand the scope of the competition when we signed up for it,” Arfin said. “We figured ‘Oh, just the two of us will be fine.’ As it turned out, it was mostly teams of four and five.”
Fortunately for them, one player’s team didn’t show, and he joined their team. The three of them, called the “Unsupervised Leopards,” won “Best Data Preparation,” one of three awards, in a field of seventeen teams. In the judges’ eyes, the team used the data to reveal new and impressive findings.
The competition, which took place at the beginning of April, required teams made up of three to five students to analyze huge data sets and “try to find an interesting story in that data,” Draves said.
The data comes from a large company, and the company is not disclosed until all DataFest competitions are finished. Last year’s company was Ticketmaster, according to Arfin. Teams are given 48 hours to complete their work and turn it in to the judges’ panel. The team slept for six or seven hours a night and then worked straight through the day.
Spending 48 hours with a new team member, Tiger Huang from Wesleyan University, worked out well, Draves said, because he brought a different skill set and together they were able to combine their knowledge.
Draves and Arfin became interested in working with data their freshman and sophomore years, respectively. After taking probability and statistics courses, both Draves and Arfin did separate research projects, and “that was about the time we both decided we wanted to get a graduate degree in statistics,” Arfin said.
“So we’ve been spending the last few years really working on these skills,” Arfin added. The math department suggested that they go to the competition and funded the three-day trip.
The 48-hour event required the boys to buckle down.
“There’s a lot of stuff that is kind of progressive in nature,” Draves said, “so you have to solve the first problem and then after that’s done, you can move on to problem two, after that’s done you can work on problem three. So it wasn’t like we could just throw something into the computer and then go work on something else, we had to spend a lot of time just working sequentially, which put a lot of time pressure on us.”