This in situ data set contains absorption coefficients of phytoplankton at the first eight Ocean and Land Colour Instrument (OLCI) bands (centred at 400 nm, 412.5 nm, 442.5 nm, 490 nm, 510 nm, 560 nm, 620 nm, and 665 nm, abbreviated as aph(400), aph(412), aph(443), aph(490), aph(510), aph(560), aph(620), and aph(665)). It combines several data sets of in situ measurements collected in open, coastal, and inland surface waters around the globe, covering the period from the first data delivery by OLCI on Sentinel-3A (S3A) in May 2016 until November 2022. The data were matched to OLCI on Sentinel-3A and -3B and used in the paper by Bracher et al. (2025). We only used absorption coefficient data derived from measurements on discrete water samples, to ensure that a similar method was followed and that the uncertainties are similar. The data set includes publicly available data as well as newly collected, measured, and analysed data sets from the Phytooptics group at the Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research (AWI, PI: Astrid Bracher) and from the Hellenic Centre for Marine Research (HCMR, PI: Andrew C. Banks). For the matching, in situ data points had to fall within the 3x3 OLCI full-resolution (FR) pixel box and a time window of + 12 hours, following established community protocols (IOCCG 2018) and in particular EUMETSAT's OLCI matchup protocol (EUMETSAT 2022). First, the in situ data were pre-processed for quality control and converted to a common format following Valente et al. (2022). We flagged and excluded from the final quality-controlled data set all data with (1) unrealistic or missing date or geographic coordinate fields, (2) poor quality (e.g., original flags) or a method of observation that did not meet the criteria for the data set (e.g., not defined in the community protocols; IOCCG 2018, 2019a, 2019b), or (3) spuriously high or low values.
For the last item, the following limits were imposed: [0.0001–10] m−1 for aph(443). OLCI pixels were discarded when flagged with the flags recommended in EUMETSAT (2022). The remaining matchups were considered valid only if, per in situ data point, more than 50% of the satellite pixels were available for the remote sensing reflectance at the 560 nm band (Rrs(560); e.g., 5 out of 9 for the 3x3 criterion) and the coefficient of variation was <0.2. To ensure that the validation process followed the established guidelines, we used ThoMaS (the Tool to generate Matchups of OC products with S3 OLCI, https://gitlab.eumetsat.int/eumetlab/oceans/ocean-science-studies/ThoMaS), dedicated matchup software developed by EUMETSAT. In situ data from Valente22 (see details on the data sets below) were already provided at the nominal OLCI band 443 nm. All other aph(λ) data were provided at hyperspectral resolution (1 nm, 2 nm, or around 3.3 nm). Following Zibordi et al. (2023), these hyperspectral absorption coefficients were transformed to the nominal OLCI bands by averaging over the specific bandwidth. Based on their associated Rrs data at the first eight OLCI bands, the OLCI matchup data were assigned to optical water classes (OWCs) according to the Mélin & Vantrepotte (2015) classification. This classification contains 17 OWCs, which range from very turbid (OWC 1) to oligotrophic to very clear waters (OWC 17). The OWC is delivered for each matchup point (if the assignment fails, the field contains "NaN"). For OLCI, we also provide the standard deviation of the OLCI matchup data within the 3x3 pixels around each in situ data point. For the in situ data, we provide an estimate of the uncertainty for each matchup point, as further described in Bracher et al. (2025).
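The acceptance criteria above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ThoMaS implementation: the function names are hypothetical, flagged pixels are assumed to be stored as NaN, the coefficient of variation is assumed to be computed on Rrs(560) with the sample standard deviation, and a non-positive mean is treated as invalid.

```python
import math
from statistics import mean, stdev

# Limits for aph(443) in m^-1, taken from the text above.
APH443_MIN, APH443_MAX = 0.0001, 10.0


def aph443_in_range(aph443):
    """Return True if an in situ aph(443) value lies within the imposed limits."""
    return APH443_MIN <= aph443 <= APH443_MAX


def matchup_is_valid(rrs560_box, min_valid_fraction=0.5, max_cv=0.2):
    """Check one matchup box (hypothetical helper, for illustration only).

    rrs560_box: Rrs(560) values of the 3x3 pixel box (9 values);
    flagged or missing pixels are assumed to be stored as NaN.
    """
    valid = [v for v in rrs560_box if not math.isnan(v)]
    # More than 50% of the pixels must be available (e.g., 5 out of 9).
    if len(valid) <= min_valid_fraction * len(rrs560_box):
        return False
    if len(valid) < 2:
        return False  # a spread cannot be computed from a single pixel
    centre = mean(valid)
    if centre <= 0:
        return False  # coefficient of variation undefined here (assumption)
    # Coefficient of variation = standard deviation / mean (sample std assumed).
    return stdev(valid) / centre < max_cv


# Example: a 3x3 box with two flagged pixels.
nan = float("nan")
box = [0.0041, 0.0040, 0.0042, nan, 0.0039, 0.0040, nan, 0.0041, 0.0040]
print(matchup_is_valid(box), aph443_in_range(0.05))
```

Keeping the two thresholds as keyword arguments makes it easy to test how sensitive the number of accepted matchups is to the chosen criteria.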
The different data compilations are described as follows:
- Bracher22: This collection contains hyperspectral aph(λ) data from AWI published in PANGAEA (Liu et al. 2019, Bracher and Liu 2021, Bracher et al. 2021a, 2021b) that match the S3A and S3B mission time and were considered for coupled model evaluation in Alvarez et al. (2022). It encompasses data from four Atlantic expeditions (2016-2019: PS103, PS107, PS113, PS121) covering polar, temperate, tropical, and shelf seas.
- Castagna22: This collection from Castagna et al. (2022) contains spectral absorption data (matching all absorption products validated in this exercise) published in PANGAEA, measured on water samples from many campaigns in 2017-2019 in Belgian waters.
- Valente22: This collection contains a global marine in situ data compilation for ocean colour validation, extracted from many data repositories (e.g., SeaBASS, PANGAEA, BODC) and published in Valente et al. (2022). All hyperspectral aph(λ) data for the lifetime of OLCI, excluding data also contained in Bracher22 (n=304), were compiled.
- Röttgers23: From the large IOP data compilation of Röttgers et al. (2023), measured during campaigns in the German Bight and adjacent regions from 2008-2021, we selected the data from the S3 mission lifetime. This data set encompasses hyperspectral aph(λ) from two RV Heincke North Sea campaigns (HE488 and HE517 in late spring 2017 and late summer 2018, respectively) and aCDOM(λ) additionally from the LP2021 German Bight campaign in summer 2021.
- SEABASS: From the hyperspectral aph(λ) data submissions to SeaBASS (https://seabass.gsfc.nasa.gov/, downloaded 28 September 2023), we used all data overlapping the S3A and S3B missions that are not contained in Valente22. These published data come mainly from US waters (campaigns: CARBON_ESTUARIES, PLUMES_AND_BLOOMS, SFMBON) and from the US ArcticCC expedition in the Northern Bering Sea in 2022.
- AODN: Here we included new IOP data submissions to the Australian Open Access to Ocean Data portal (AODN, https://portal.aodn.org.au/, downloaded 19 July 2023) that are not provided in Valente et al. (2022) or Lehmann et al. (2023) and that match the S3A and S3B OLCI lifetime. AODN-1 contains hyperspectral aph(λ) data from the CSIRO (Commonwealth Scientific and Industrial Research Organisation) Hydrochemistry Facility Integrated Marine Observing System (IMOS, https://research.csiro.au/hydrochemistry/projects/integrated-marine-observing-system-imos/). The data are from several expeditions in Australian waters (at Torres Strait in 2016, at the mouth of the Fitzroy River in 2017, and at the Coral Sea and Queensland Shelf in 2016 (IN2016) and 2020 (IN2020)) and from the Lucinda Jetty Coastal Observatory (https://researchdata.edu.au/imos-srs-satellite-observatory-ljco/476837).
- Banks-new: These are hyperspectral aph(λ) data from the under-sampled oligotrophic Eastern Mediterranean. They were collected by HCMR (PI: A. Banks) and the Joint Research Centre (JRC) on a joint optics cruise (HCMR-JRC OPTICS) in April to May 2022. This data set is not included in Zibordi et al. (2023) but follows the same measurement procedure.
- Bracher-new: AWI (PI: A. Bracher) recently (January 2020 until November 2022) conducted four more large expeditions spread over the temperate and polar Atlantic Ocean (MSM93, PS126, PS131, and PS133-1) and three weekly campaigns at Germany's largest inland water, Lake Constance (BS-1, BS-3, BS-4), during which about 800 valid measurements of hyperspectral aph(λ) were collected. The measurement protocol is the same as described in Liu et al. (2018).
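Most of the compilations listed above deliver hyperspectral aph(λ), which is reduced to the nominal OLCI bands by averaging over the band width. The sketch below shows one possible way to do this with an unweighted (boxcar) mean. The band widths in `OLCI_BANDS` (15 nm for the 400 nm band, 10 nm for the others) are nominal values assumed here for illustration, and the function names and the unweighted mean are likewise assumptions; the procedure actually applied is the one given in Zibordi et al. (2023).

```python
import math

# (centre, width) in nm for the first eight OLCI bands.
# Widths are assumed nominal values; verify against the OLCI specification.
OLCI_BANDS = {
    "aph(400)": (400.0, 15.0),
    "aph(412)": (412.5, 10.0),
    "aph(443)": (442.5, 10.0),
    "aph(490)": (490.0, 10.0),
    "aph(510)": (510.0, 10.0),
    "aph(560)": (560.0, 10.0),
    "aph(620)": (620.0, 10.0),
    "aph(665)": (665.0, 10.0),
}


def band_average(wavelengths_nm, aph, centre_nm, width_nm):
    """Unweighted mean of aph over [centre - width/2, centre + width/2].

    Returns NaN if no spectral sample falls inside the band.
    """
    lower = centre_nm - width_nm / 2.0
    upper = centre_nm + width_nm / 2.0
    samples = [a for w, a in zip(wavelengths_nm, aph) if lower <= w <= upper]
    if not samples:
        return math.nan
    return sum(samples) / len(samples)


def to_olci_bands(wavelengths_nm, aph):
    """Reduce one hyperspectral spectrum to the eight band values."""
    return {
        name: band_average(wavelengths_nm, aph, centre, width)
        for name, (centre, width) in OLCI_BANDS.items()
    }


# Example: a synthetic spectrum at 1 nm resolution from 390 to 700 nm.
wavelengths = list(range(390, 701))
spectrum = [0.05 * math.exp(-0.005 * (w - 390)) for w in wavelengths]
bands = to_olci_bands(wavelengths, spectrum)
```

Because the function selects samples by wavelength rather than by index, the same code handles spectra at 1 nm, 2 nm, or around 3.3 nm resolution; coarser spectra simply contribute fewer samples per band.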