Analysis of search queries suggested by a Swedish climate obstruction network


This data set comprises data traces related to search queries used in climate obstruction. It is based on "klimatsans" ("Climate Sense" or "Climate Reason", translated from Swedish; cf. Vowles & Hultman, 2021), a Swedish network that has existed since 2014, runs a Swedish-language blog, and submits opinion pieces and letters to the editor to various Swedish news outlets. The stated aims of the network amount to first-level obstruction, i.e. it rejects the scientific consensus that increased atmospheric CO2 leads to climate change. The data concerns how the network, throughout its various publications, invites readers to “google” certain words (keyphrases). The data set includes: 1) all blog posts published on klimatsans.com from January 2014 to June 2022; 2) all hyperlinks from the blog; 3) tabulation, count, and coding of all search queries suggested in the blog, identified as the phrases that follow the Swedish imperative verb "googla"; 4) tabulation of all uses of 25 selected keyphrases in Swedish newspapers; 5) search engine results pages for these 25 queries from Google and DuckDuckGo (each run three times: plain, verbatim using quotation marks, and preceded by the term "googla") (original data available via Sünkler et al., 2023); 6) tabulation and coding of domains frequently targeted by hyperlinks and/or listed in search engine results pages. Furthermore, the data set includes some scripts for replication, an extensive README file with methodological additions, and details on coding schemes. The data was originally collected to trace data voids through the texts of their creators or proponents. This provides insights into how data voids are created, promoted, used, and, if they do not disappear, also abandoned.
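The three query forms mentioned in item 5 can be illustrated with a minimal sketch. The function name `query_variants`, the dictionary keys, and the sample keyphrase are assumptions made for illustration; they are not taken from the replication scripts in the data set.

```python
def query_variants(keyphrase: str) -> dict[str, str]:
    """Build the three query forms described above for one keyphrase.

    Hypothetical helper: the key names are illustrative only.
    """
    phrase = keyphrase.strip()
    return {
        "plain": phrase,                # (a) the suggested query as is
        "verbatim": f'"{phrase}"',      # (b) quoted, i.e. verbatim search
        "googla": f"googla {phrase}",   # (c) preceded by the imperative "googla"
    }


# Hypothetical keyphrase, not one of the 25 keyphrases in the data set.
variants = query_variants("exempel på sökfras")

# 25 keyphrases x 3 forms x 2 search engines gives an upper bound on result pages.
max_serps = 25 * len(variants) * 2
```

Here `max_serps` is an upper bound of 150 result pages; queries that return no results yield fewer pages in practice.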

The data concerns how the Swedish climate obstruction network "klimatsans" urges readers, across its various publications, to "google" certain words (keyphrases). See the full description above for details.

Queries were identified by scraping the entire blog, looking for the imperative verb "googla" (Swedish for "google!") followed by a keyphrase. This was assumed to constitute one query, which we then followed through Retriever's news database (all Swedish printed press), as well as the search engines Google and DuckDuckGo.
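A rough sketch of how such keyphrases might be pulled from post text is given below. It assumes that a suggested keyphrase is either quoted or follows a colon shortly after "googla". The regular expression, the 20-character gap allowance, and the sample sentences are illustrative assumptions, not the pattern used to build the data set.

```python
import re

# Assumed pattern: "googla", up to 20 filler characters, then either a quoted
# phrase or a colon followed by text running to the end of the sentence.
PATTERN = re.compile(
    r'\bgoogla\b[^.!?\n"“”:]{0,20}?'
    r'(?:["“”](?P<quoted>[^"“”\n]+)["“”]|:\s*(?P<colon>[^.!?\n]+))',
    re.IGNORECASE,
)


def extract_keyphrases(text: str) -> list[str]:
    """Return keyphrases that explicitly follow the imperative "googla"."""
    phrases = []
    for match in PATTERN.finditer(text):
        phrase = match.group("quoted") or match.group("colon")
        phrases.append(phrase.strip())
    return phrases


# Hypothetical sentences, not quotations from the blog.
sample = (
    'Googla "exempel ett" för mer. '
    "Du kan också googla: exempel två. "
    "Jag brukar googla ibland."
)
found = extract_keyphrases(sample)
```

The third sample sentence uses "googla" without a quoted or colon-introduced phrase and is therefore skipped. Any pattern of this kind only yields candidates, which still need manual checking.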


Total universe/Complete enumeration


  1. On 1 August 2022, we used the software httrack to crawl the blog of the climate obstruction network (CON), retrieving 2654 posts.
  2. We extracted 1943 hyperlinks from the retrieved blog posts.
  3. We identified 268 occurrences of the term “googla” in 177 different blog posts since 2014.
  4. We identified and tabulated all explicitly suggested keyphrases, i.e., those that follow an imperative verb and are quoted or follow a colon.
  5. We coded the retrieved keyphrases according to their syntactical composition. Coding was carried out by the first author and validated by the second author.
  6. We created a set of 25 keyphrases to use as seeds for further data creation. The set comprises all ten keyphrases that had been suggested at least four times, plus 15 strategically selected keyphrases that had been used two or three times, added to increase variation.
  7. We submitted the compiled keyphrases to the Swedish media database Retriever, yielding 240 results from Swedish print media. Of those, 204 asked readers to “google” the respective keyphrase.
  8. We submitted the same keyphrases as queries to Google Search and DuckDuckGo, using the search retrieval analysis software RAT (Result Assessment Tool) to obtain the first SERP for each search engine as well as the HTML source code of results (Lewandowski et al., 2022; data available via Sünkler et al., 2023). We submitted (a) the suggested queries; (b) the suggested queries in quotation marks (i.e., verbatim search); and (c) the Swedish imperative form of “google” followed by the suggested keyphrase (no quotation marks). With a maximum of 10 results per query, but often fewer and sometimes no results for queries b and c, we obtained 146 SERPs and 1001 search results.
  9. Of these, 249 results link to the CON’s blog, and a further 236 results mention the CON or its authors, usually because they are signed by the CON or link to it. Few referred to the blog as engaged in climate obstruction (e.g. by debunking the CON’s claims); conversely, not all climate obstruction content in the data set mentions the CON.
  10. Based on search results and hyperlinks, we classified 204 unique domains as frequent, i.e. they occurred in at least two SERPs, at least 10 hyperlinks, or at least 1 SERP and 4 hyperlinks. As these counts represent the possibility of finding a specific domain, we included duplicate targets in these counts. We coded these frequent domains regarding their site type and language. Coding was carried out by the first author and validated by the second author. For more details, see the included README file.
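The frequency rule in step 10 can be sketched as follows. The sketch assumes one URL per listed search result and one URL per hyperlink, and it counts every listing of a domain, so duplicates are included. The helper names and the example.com-style URLs are hypothetical, and how the data set itself normalises domains is documented in the README rather than here.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Reduce a URL to its host name, dropping a leading "www." (an assumption)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_frequent(serp_count: int, link_count: int) -> bool:
    """Step 10 rule: >= 2 SERPs, or >= 10 hyperlinks, or >= 1 SERP and >= 4 hyperlinks."""
    return serp_count >= 2 or link_count >= 10 or (serp_count >= 1 and link_count >= 4)


def frequent_domains(serp_urls: list[str], hyperlink_urls: list[str]) -> list[str]:
    """Return the sorted domains that meet the frequency rule."""
    serp_counts = Counter(domain_of(u) for u in serp_urls)
    link_counts = Counter(domain_of(u) for u in hyperlink_urls)
    domains = set(serp_counts) | set(link_counts)
    return sorted(
        d for d in domains if d and is_frequent(serp_counts[d], link_counts[d])
    )


# Hypothetical URLs for illustration only.
serp_urls = ["https://a.example/x", "https://www.a.example/y", "https://b.example/"]
hyperlink_urls = ["https://b.example/p"] * 4 + ["https://c.example/"] * 9
result = frequent_domains(serp_urls, hyperlink_urls)
```

In this toy input, `a.example` qualifies through two search result listings, `b.example` through one listing plus four hyperlinks, and `c.example` falls short with nine hyperlinks and no listing.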


Content coding


Identifier
DOI https://doi.org/10.5878/zb1v-ba15
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=1038516d3d3a27faee654eaafd42fdd1c9c2bb2be870c909971a541eeecfc4fd
Provenance
Creator Rödl, Malte
Publisher Swedish National Data Service; Svensk nationell datatjänst
Publication Year 2024
Rights Access to data through SND. Access to data is restricted.
OpenAccess false
Contact https://snd.se
Representation
Discipline Social Sciences
Spatial Coverage Sweden