I took this week out of my studies in order to learn about Big Data at my university’s summer school on the subject. I enjoy research and am interested in different types of information, and I’m also trying to expand my skills and get employers to hire me after my PhD, so I thought it would be worth going to, and I did indeed learn some new things.
The main outcome for me was that I can now say with more confidence what Big Data is, and what processes are involved. I did a bit of reading before the summer school in case I found myself having no idea whatsoever what was going on, and even though some of the things I read were repeated in the course, many concepts I was confused about or that were poorly explained in the books were clarified for me. For example, I got confused by a lot of the words that were bandied around in the books and what they related to; I now know that Hadoop is to MapReduce (which as far as I can tell is basically tallying up A LOT of VERY HIGH frequencies very fast) what Word is to word processing - when it’s all unfamiliar, it’s easy to get confused between processes and program names! One thing that continues to overwhelm me, though, is the number of applications and coding languages/algorithms that basically do the same thing - more on that a bit later on. There was a lot of sitting and listening rather than actually getting to play with the computer functions ourselves, and I did suffer a bit of ‘death by PowerPoint’ - but maybe that’s my own fault for choosing the very general subjects and avoiding anything that looked potentially too advanced for me (looking back, I reckon I would happily have been able to take the R course, which I chickened out of when I was choosing my modules).
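To make the ‘tallying up’ idea a bit more concrete, here is a minimal sketch in Python of the classic MapReduce word-count example. It is not Hadoop and it runs on a single machine; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample sentences are made up purely for illustration. The point is only to show the three steps: emit a count for each word, group the counts by word, then add them up.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all the emitted counts by word."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Reduce step: add up the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = []
    for document in documents:
        # In a real cluster each document (or chunk) could be mapped
        # on a different machine; here we simply loop.
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


# Hypothetical sample data, just to have something to count.
documents = ["the cat sat", "the cat ran", "a dog sat"]
counts = word_count(documents)
print(counts)
```

As far as I understand it, the appeal of splitting the work this way is that the map and reduce steps can be spread over many machines, which is where a framework like Hadoop comes in.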
The summer school also reinforced and added to my quantitative knowledge. In my occasional (often depressing) job searches, I find that employers want quantitative as well as qualitative skills, so it’s reassuring to know that, ten years after my statistics GCSE and seven years after I did a Methods In Social Research module, I still have them. In the Introduction to Data Mining course, for example, we were taken through processes of cleaning up data (for example, dealing with anomalies, missing data and mistakes), as well as classification and clustering, and those were things I could see myself doing as part of a job. I also took a short course on data protection, which was generally interesting, and I now understand a lot of things about EU law (such as the difference between a regulation and a directive) that I didn’t before! Data protection, and what changes in the law on it are coming (mostly on the consumers’ side!), is something we should all be concerned with anyway.
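As a rough illustration of what ‘cleaning up data’ can look like in practice, here is a small Python sketch. It is not taken from the course; the function name `clean_ages`, the plausible range of 0 to 120 and the sample values are all assumptions chosen for the example. It treats unreadable or implausible entries as missing, flags where they were, and fills the gaps with the median of the values that look sensible (one simple strategy among many).

```python
from statistics import median


def clean_ages(raw_values, low=0, high=120):
    """Return (cleaned, flagged) for a list of raw age entries.

    cleaned: a list of floats with bad entries replaced by the median
             of the plausible values.
    flagged: the positions of entries that were missing, unreadable
             or outside the range [low, high].
    Assumes at least one plausible value exists; otherwise median()
    raises StatisticsError.
    """
    parsed = []
    for value in raw_values:
        try:
            parsed.append(float(value))
        except (TypeError, ValueError):
            # Blank cells, "n/a", None and similar count as missing.
            parsed.append(None)

    plausible = [v for v in parsed if v is not None and low <= v <= high]
    fill = median(plausible)

    cleaned, flagged = [], []
    for position, value in enumerate(parsed):
        if value is None or not (low <= value <= high):
            flagged.append(position)
            cleaned.append(fill)
        else:
            cleaned.append(value)
    return cleaned, flagged


# Hypothetical sample: two missing entries and one obvious typo (430).
raw = ["34", "", "29", "n/a", "430", "41"]
cleaned, flagged = clean_ages(raw)
print(cleaned)
print(flagged)
```

Keeping the list of flagged positions matters: it means the anomalies can be looked at individually later rather than being silently smoothed away.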
So, where next? I’m not sure, really. For one thing, who’s going to employ a history PhD who’s dabbled in Big Data techniques over someone who has a PhD _in_ Big Data techniques? Maybe the techniques are something I can apply to future history projects if I have any, though my first love remains the individual story. I’d probably use Big Data techniques to get a broad picture, then zoom in (think of starting off with a whole country on Google Maps and then pressing the plus until you can see one block or roof!) on the cases that interest me - generally, the outliers, people who can’t be classified easily! Also, as I mentioned earlier, there are so many different coding languages and applications out there, and it’s hard to know which to learn. I’ve been told that R is probably the one to go with, but apparently it can take some time to pick up, and would that be more fruitful for me than concentrating on and emphasising the skills I’m picking up with my PhD anyway - teaching, public speaking, and conference organisation, including social media gubbins? I suppose I could save learning to code for the months when I’m unemployed and desperate for something to occupy me…
Either way, it’s been an insightful week. I’ve come across people whose jobs and projects sound really cool, got to eat at Wivenhoe House (the posh hotel on campus) and seen Campus Cat several times, as I was in the building where he likes to hang out. And now, back to my stories.
Campus Cat looking after the summer school reception desk.