The web archiving of Irish election campaigns: A case study into the usefulness of the Irish web archive for researchers and historians.
Presented at the joint RESAW/IIPC Conference, School of Advanced Study, University of London, 14-16 June 2017.
To date, I can find no published Irish research that uses the Irish web archive as a primary source. One purpose of this paper, therefore, is to open a discussion on the use of the Irish web archive, in the hope that it will encourage more Irish researchers and historians to use the resource.
The Irish web archive is hosted by the National Library of Ireland (or NLI for short). In 2011, the NLI collaborated with the Internet Memory Foundation to organise a web archiving project for the 2011 General Election. It was the first time that a themed web archiving project was organised in Ireland. Since then the NLI has taken great strides to secure a web archiving programme for the capture of Irish online social, cultural and political heritage. By means of a case study, this paper examines the usefulness of the resource for the study of election campaigns and political history. The approach for this study is deliberately designed for a researcher with little or no programming experience, but who has some computer skills and is accustomed to using the live web for research.
In Ireland, prior to the 1990s, presidential elections received far less attention from scholars in comparison to general elections. This is in some part due to the nature of the office and perceived role of the President of Ireland. The President acts as the Head of State for visiting dignitaries, but the position has few powers of substance with the most prominent being:
- the power to refer bills to the Supreme Court to determine their constitutionality;
- the power to refuse a request by the Taoiseach (prime minister) to dissolve parliament.
While the President is directly elected, the position has been largely symbolic and is often seen as a service reward for retired politicians and statesmen (O’Malley, 2012: 636; Galligan, 2012: 596).
However, the elections of Mary Robinson in 1990 and subsequently of Mary McAleese in 1997 and 2004 have since ignited much interest in the perceived role of the Irish President. Both Robinson and McAleese were renowned for their activist roles during their tenures and were also noted for raising a few eyebrows in the government (Galligan, 2012: 596). They restructured the nature of the role, away from elder statesmen, and evoked public pride in its functions (Ibid. 597). Following this, the 2011 presidential election was widely anticipated – and did not disappoint!
In Ireland, presidential election campaigns differ from general election campaigns in that there is less emphasis on door-to-door canvassing or meeting the electorate. Rather, O’Malley (2012: 642) observes that the media is the arena where campaigns are fought. Compared to Irish general elections with hundreds of candidates, there are usually only a few candidates in a presidential election, and broadcast media are obliged to give each candidate equal coverage. In particular, for the 2011 election, O’Malley (Ibid.) suggests it was not a case of competing for coverage but “the nature of the coverage that mattered.”
There were seven candidates in the 2011 election; three were nominated by political parties, while four opted for the rarely used process of attaining a nomination from a consortium of local government councils. Deemed as being the most negative and dirty presidential campaign to date, it was also the first presidential campaign to witness the use of social network sites (SNS), and each candidate had their own website (Graham and Hogan, 2014: 32).
Rationale for the Research
The rationale for this research stems from the point that despite successful progress being made to capture and curate web archived materials, there is still a lack of engagement by researchers to use web archives as a source for research (Dougherty et al., 2010; Meyer et al., 2011; Pennock, 2013; Truman, 2016; Schroeder and Brügger, 2017).
Another rationale for this study evolves from the earlier research I conducted for my MA thesis, entitled: “The web archiving of election campaigns: what gets captured and why?” The purpose of the research was to identify, assess and compare the criteria assessments for the selection and capturing of online election campaigns in the Library of Congress Web Archive, UK Web Archive and the NLI Web Archive. While the concept seemed straightforward, there were many challenges; one, in particular, was the lack of accessible information for voluntary thematic collections in terms of the selection criteria, and the scope and content of a particular collection.
For example, in 2015, the NLI indicated that they had several collections in their web archive; however, no collections were visible to the end-user in the actual resource. Rather, it was merely an A-Z listing of every site that had been captured by the NLI since 2011. Also, in 2015, only one briefing document was available online with respect to the selection criteria and the scope and content of a collection. This was for the 2011 general election (GE11_WebArchive_SelectionCriteria). Yet the NLI had added several other thematic collections to the archive, such as the 2011 presidential collection, with no documentation available to establish what belonged where. In addition, while the GE11 briefing document was very beneficial, it did not list the sites that were purposefully captured as part of the collection, which limited its usefulness for my research.
In 2017, by contrast, while there are still no collections visible to the end-user in the resource, and it is still an A-Z listing of every site that has been captured since 2011, the NLI staff have been extremely busy.
Their website now provides links to documentation for their voluntary thematic collections, covering the background of the collection, the selection criteria, the crawl parameters, and the scope and content, with a listing of harvested sites (NLI Web Archive Collections). Their efforts deserve high commendation, as the addition of these scoping documents fundamentally changes how we can use the resource.
This brings me to the case study and research questions.
Using the NLI 2011 Presidential Election collection as a case study, this research is concerned with two simple research questions:
- How useful is this collection in terms of what is saved?
- How useful is this collection for researchers and historians?
The approach for this study is designed for a researcher with little or no programming experience, who has some skills with Microsoft Excel and is accustomed to using the live web and secondary sources for research.
The collection document for the 2011 presidential election outlines the background of the collection, the site selection, the crawl parameters, the general selection criteria and the scope and content of the collection. It also provides an alphabetical list of the collection content, giving the name of each website or web section, the URL of the live website, and the URL of the captured site in the web archive.
In a closer analysis of the content selection, at the time of this study (01/05/2017 to 01/06/2017), there were:
– 66 Names of websites/ web sections listed
– 66 Live Website URLs listed for above
– 66 Web archive URLs listed which direct to the first captures
(Note: two of the captured websites were not operating at the time of the study.)
To determine how useful this collection is in terms of what is saved, I used an Excel spreadsheet to record the website titles and the website URLs, and designed a set of statements to test the collection. The collection was tested from 01/05/2017 to 01/06/2017.
I started by determining how many of the URLs of the selected websites were still valid, meaning that the “Live” URL provided in the selection still operated in a browser without giving an error (e.g. 404, 403). In doing this, I also noted how many URLs redirected, meaning the “Live” URL does not give an error but redirects to another URL in a browser.
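The check described above was carried out by hand in a browser and recorded in Excel. For readers who would like to see the decision rule written out, the following is a minimal sketch in Python. It is not the method used in this study: the function names `normalise` and `classify` are hypothetical, and it assumes that a status code and a final URL have already been obtained for each “Live” URL by whatever means. It also assumes that a change of `http` to `https`, or a trailing slash, does not count as a redirect.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to host + path so trivial differences are ignored.

    Assumption: scheme changes and trailing slashes are not treated
    as a redirect to a different resource.
    """
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path.rstrip("/")


def classify(live_url, status_code, final_url):
    """Label one URL check as 'valid', 'redirect' or 'bad'.

    status_code is None when the site could not be reached at all.
    """
    if status_code is None or status_code >= 400:
        return "bad"        # e.g. 403, 404, or no response
    if normalise(final_url) != normalise(live_url):
        return "redirect"   # no error, but a different URL was reached
    return "valid"          # no error and the same URL was reached


# Hypothetical example using a placeholder address, not a site from the collection.
print(classify("http://www.example.ie/", 200, "http://www.example.ie"))
```

Note that a “valid” label here says only that the URL still answers. Whether it still leads to the intended website is a separate question that needs a human look at the page.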
Of the 66 URLs, 41 are still valid, 7 redirect, and 18 are bad URLs.
In terms of percentages, 62% of the URLs are still valid on the live web, 27% register some kind of error such as page not found, and 11% redirect to an alternate URL.
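The percentages can be reproduced from the counts reported above. The short Python sketch below is only an illustration of the arithmetic (the study itself used Excel); the dictionary labels and the function name `percentages` are my own.

```python
# Counts as reported in the text: 41 valid, 7 redirecting, 18 bad URLs.
counts = {"valid": 41, "redirect": 7, "bad": 18}


def percentages(counts):
    """Return each count as a whole-number percentage of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


shares = percentages(counts)
print(sum(counts.values()))  # 66
print(shares)                # {'valid': 62, 'redirect': 11, 'bad': 27}
```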
Next, I looked at whether or not the “Live” URLs direct to the actual name/title of the website/web section as indicated in the selection.
37 of the URLs direct to the designated name/title of the website/web section as indicated, but 29 do not.
In terms of percentages, 44% of the URLs do not direct to the designated website title as indicated in the NLI scoping document.
So what do I mean by this?
If I put the URL in the browser, it does not redirect, but it arrives at a website which is obviously not the intended resource.
For example, the URLs provided for the websites of the Christian Solidarity Party and the Workers Party of Ireland arrive at websites in Japanese. So, even though the URL is still valid, in that it does not give an error, it does not direct to the designated site.
Finally, I set out to find how many of the captured websites are still available on the live web, and how many are gone.
Of the 66 websites that were harvested for this collection, only 17 are still available on the live web.
In terms of percentages, 74% of the captured websites in the NLI 2011 presidential election collection are currently not available on the live web.
What do I mean by that?
If we compare the archived Fis Nua website (October 2011) with the current Fis Nua website (June 2017), we find that they are completely different sites. Most of the content on the 2011 site is nowhere to be found on the 2017 site. It is a similar story with Irish Central: the presidential web section that was captured in 2011 cannot be located, and old links in the 2011 archived site return errors on the 2017 site.
It is not simply a case of bad URLs, or URLs that redirect; websites are also prone to being deleted, upgraded, moved to different domains and subjected to dramatic content changes. This exercise emphasises the value of what was saved by the NLI web archive, particularly when we now see what is lost.
This brings me on to the second research question: How useful is the collection for researchers and historians?
As mentioned earlier, the 2011 presidential election was widely anticipated, and did not disappoint! Noted as being one of the dirtiest presidential campaigns ever, it was a media frenzy from start to finish. It was also the first presidential campaign in Ireland to witness the use of social network sites (SNS), with most candidates utilising Facebook, Twitter, YouTube and Flickr (Graham and Hogan, 2014: 32). The seven candidates also had their own websites.
|Dana Rosemary Scallon||http://www.danaforpresident.ie/|
|Michael D. Higgins: The President who will do us proud||http://www.michaeldhiggins.ie/|
|Martin McGuinness: The People’s President||http://www.thepeoplespresident.ie/|
|Mary Davis for President||http://www.marydavis.ie/|
|Sean Gallagher: For an Independent President||http://www.seangallagher.com/2011/|
I began by testing the target URLs belonging to each candidate’s website to determine whether the sites are still available on the live web. I found that six of the presidential candidates’ websites have since disappeared, while the URL for the website of the elected President, Michael D. Higgins, redirects to the official President of Ireland website, and so, in effect, the original website cannot be found either.
It is not unusual for the websites of election candidates to disappear after an election, or indeed for government departmental websites to undergo change due to a turnover in government after an election (Crook, 2008; Aubry, 2010; Koerbin, 2013).
It could be argued that this is a valid justification for why web archiving institutions conduct focused thematic crawls during election campaigns. Indeed, it may be further argued that the web archiving of election campaigns and government websites is essential for the preservation of the historical, cultural, social and political record.
Fortunately, the NLI included all candidates’ websites in their focused thematic crawl for the 2011 presidential election. However, while all candidates’ websites were reportedly captured, there is unfortunately a blip with accessing the captured site of one candidate, Gay Mitchell.
Compared to other candidates in the 2011 presidential election, the eventual winner, Michael D. Higgins, did not seem to suffer as much from the media frenzy; in fact, for the most part, he seemed to have a skeleton-free campaign (Irish Central).
Soon after he was elected President, however, a different type of headline started to appear, with claims that Higgins was anti-American and also anti-Israel (Irish Central; TheCommentator; Arutz Sheva 7; Anson).
While these headlines did not seem to find their way into mainstream Irish print/web media, nonetheless, Dr Mark Humphreys, a distinguished DCU scholar, claims (on his blogsite) that Michael D. Higgins is
- “The most anti-Israel head of state in the western world” and also
- “The most anti-American head of state in the western world”.
Humphreys uses an assortment of traditional and new media items to back up both of these arguments. Of particular interest to me is Humphreys’s use of some screenshots of press releases that were published on the website of Michael D. Higgins in 2004.
- “Bush your war is wrong”
- “Higgins pays tribute to Arafat, founding figure of modern Palestine”
At the time of these press releases Higgins was the Labour Party’s foreign affairs spokesman. As we now know, the website of Michael D. Higgins is no longer available on the live web, so it would be difficult to refer back to the original source of the press releases that Humphreys uses to back up his argument. Furthermore, it would also be difficult to see whether these specific press releases are taken out of context, or how they compare with other press releases and events of that time.
On the other hand, we also know that the NLI captured Higgins’s website as part of the 2011 presidential collection. So, in a novel approach, I thought it would be interesting to do a close and distant reading of Higgins’s press releases from 2004-2006 to look for evidence that either supports or deflates Humphreys’s arguments that Higgins is both anti-American and anti-Israel.
First, I copied and pasted all of the available press releases into an Excel spreadsheet, with date, heading, and body. A small number of the press releases did not harvest properly in the NLI web archive. I can’t be sure, but I could hazard a guess that 15-20 are missing from the original site. This left 137 press releases to work with.
I then took all the body texts and put them through Voyant Tools to get some idea of the type of topics Higgins referred to, and to begin forming some categorisations. It soon became clear that press releases concerned with Iraq were often related to America, and it was a similar story for Israel and Palestine.
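Voyant Tools did this work in the study, and no programming was needed. For readers curious about what a distant reading of this kind counts, the Python sketch below tallies word frequencies across a set of texts. It is an illustration only and not a reproduction of Voyant’s processing: the function name `term_frequencies` and the very short stopword list are my own assumptions, and the sample input is just the two press release headings quoted above, not the body texts.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list; real tools use much longer ones.
STOPWORDS = {"the", "of", "to", "is", "your", "a", "and", "in"}


def term_frequencies(texts, stopwords=STOPWORDS):
    """Count how often each non-stopword appears across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts


# The two headings quoted earlier in this post, used here as sample input.
headings = [
    "Bush your war is wrong",
    "Higgins pays tribute to Arafat, founding figure of modern Palestine",
]
freqs = term_frequencies(headings)
print(freqs.most_common(5))
```

With 137 body texts in place of two headings, the most common terms give a first impression of recurring topics, which is the role the distant reading played here.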
But it was only through a close reading of each press release that I was fully able to assess the context and designate a category, and these were soon narrowed down with the following labels:
- America / Iraq
- Israel / Palestine
- Overseas Aid / Development
- Domestic Affairs
- Other Foreign Affairs
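In this study the categories were assigned by close reading, and that step cannot be automated away. As a hedged illustration of how the distant reading could feed into it, the Python sketch below suggests candidate labels from keywords and flags everything else for manual reading. The keyword lists and the function name `suggest_categories` are hypothetical, chosen only to reflect the observation that Iraq tended to appear alongside America, and Israel alongside Palestine.

```python
import re

# Assumption: illustrative keyword lists, not the criteria used in the study.
KEYWORDS = {
    "America / Iraq": {"america", "american", "iraq", "bush"},
    "Israel / Palestine": {"israel", "palestine", "palestinian", "arafat"},
}


def suggest_categories(text):
    """Return candidate labels for a text, or flag it for close reading."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    matches = [label for label, keys in KEYWORDS.items() if words & keys]
    return matches or ["needs close reading"]


print(suggest_categories("Bush your war is wrong"))
```

A suggestion of this kind is only a starting point: a press release that mentions Iraq may still be mainly about overseas aid, which is why the final label has to come from reading the text in context.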
So what did I find?
Certainly, the press releases indicate that Higgins was preoccupied with America / Iraq, and Israel / Palestine more than any other foreign affair during this time period. Indeed, using these press releases one could support an argument for Higgins being anti-Iraq war and disgruntled with American foreign policy. However, in my opinion, based on these sources, I could not find any evidence to substantiate an argument for Higgins being anti-American.
And what of the claim that Higgins is anti-Israel?
Based on these press releases, one could establish an argument for Higgins being pro-Palestinian, and perhaps even opposed to Israeli foreign policy. But in terms of being anti-Israel: I am an early career researcher, and I really like what I do, so I think it might be wiser for me to wait until after the next election before making any further comments on that.
With regard to suggestions for content to be captured by the NLI for #Áras2018:
Some of the Twitter accounts of the #Áras2011 candidates also disappeared after the election, so the Twitter accounts of candidates, and of the political parties and local governments who nominate candidates, would be a welcome addition, as would the blogsite of Mark Humphreys.
Every link in this post has been saved in the Internet Archive: Wayback Machine (July 2017). It is inevitable that links become broken over time, so if you come across a link in this post that does not direct to the intended source, please copy the URL and paste it into the Wayback Machine search box.
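As an alternative to the search box, a Wayback Machine address can be built directly by placing the original URL after `https://web.archive.org/web/` and a timestamp. The Python sketch below shows this; the function name `wayback_lookup_url` is my own, and it assumes the usual Wayback Machine behaviour in which `*` lists all captures and a timestamp leads to the capture nearest that date.

```python
def wayback_lookup_url(url, timestamp="*"):
    """Build a Wayback Machine address for a URL.

    timestamp: "*" to list all captures, or digits such as "201707"
    (year and month) to request the capture nearest that date.
    """
    return "https://web.archive.org/web/{}/{}".format(timestamp, url)


# Example with one of the candidate website URLs listed in this post.
print(wayback_lookup_url("http://www.michaeldhiggins.ie/"))
print(wayback_lookup_url("http://www.michaeldhiggins.ie/", "201707"))
```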
- Aubry, S. (2010). “Introducing Web Archives as a New Library Service: the Experience of the National Library of France.” Liber Quarterly, vol. 20, no. 2, pp. 179–199. <https://www.liberquarterly.eu/articles/10.18352/lq.7987/>
- Crook, E. (2008). “Web Archiving in a Web 2.0 World.” Proceedings of Dreaming 08 – Australian Library and Information Association Biennial Conference, 2-5 September 2008. National Library of Australia Staff Papers [NLASP], 10 Nov 2008. <https://www.nla.gov.au/content/web-archiving-in-a-web-20-world>
- Dougherty, M. et al. (2010). Researcher Engagement with Web Archives-State of the Art. Joint Information Systems Committee Report, August 2010. London: JISC. SSRN
- Gallagher, M. (2012). “The Political Role of the President of Ireland.” Irish Political Studies, vol. 27, no. 4, pp. 522–538. Taylor and Francis
- Galligan, Y. (2012). “Transforming the Irish Presidency: Activist Presidents and Gender Politics, 1990–2011.” Irish Political Studies, vol. 27, no. 4, pp. 596–614. Taylor and Francis
- Koerbin, P. (2013). “Archiving online election campaigns.” National Library of Australia, Web Archiving Blog, 09 August, 2013 <https://www.nla.gov.au/australias-web-archives/2013/08/09/archiving-online-election-campaigns>
- Meyer, E., Thomas, A. and Schroeder, R. (2011). Web Archives: The Future(s), Report for the Oxford Internet Institute for the International Internet Preservation Consortium (IIPC), June 30, 2011. SSRN.
- Graham, S. and Hogan, J. (2014). “An Examination of Seán Gallagher’s Presidential Campaign in a Hybridized Media Environment.” Irish Communication Review, vol. 14, no. 1, pp. 30–47. ARROW@DIT <http://arrow.dit.ie/icr/vol14/iss1/3>
- O’Malley, E. (2012). “Explaining the 2011 Irish Presidential Election: Culture, Valence, Loyalty or Punishment?” Irish Political Studies, vol. 27, no. 4, pp. 635–655. Taylor and Francis
- Pennock, M. (2013). Web-Archiving. DPC Technology Watch Report, March, 2013. Digital Preservation Coalition in association with Charles Beagrie Ltd. <http://www.dpconline.org/knowledge-base/tech-watch-reports>
- Schroeder, R. and Brügger, N. (2017). “Introduction: The web as history”, In: Eds. N. Brügger and R. Schroeder, The Web as history, London: UCL Press, pp. 1-19 <http://discovery.ucl.ac.uk/1542998/1/The-Web-as-History.pdf>
- Truman, G. (2016). Web Archiving Environmental Scan. Harvard Library Report, 2016. <https://dash.harvard.edu/handle/1/25658314>