ARA Data Release: June 2021 to January 2026
- Haven King-Nobles

As part of FWI's Alliance for Responsible Aquaculture (ARA) program, we collect data from the farms we work with in India to understand the conditions for farmed fishes and help improve their welfare. We are now publishing our most comprehensive dataset to date, covering data collected from the program's inception in June 2021 through January 31, 2026.
This release also marks a shift in how we share our data. For the first time, we are publishing on GitHub rather than Google Sheets, and we are releasing not just water quality data but also stocking and harvest records, enrolled pond information, and dropout data.
We are sharing this data in the hopes that it will be useful to other parties interested in improving animal welfare in pond aquaculture.
What’s Included
The dataset consists of four files:
Water quality measurements (11,433 records): Dissolved oxygen, pH, ammonia, temperature, turbidity, and other parameters collected during farm visits, along with corrective actions taken when values fall outside acceptable ranges.
Enrolled ponds (210 ponds): A snapshot of all ponds enrolled in the ARA program as of February 2, 2026, including pond characteristics, feed practices, and equipment.
Stocking and harvest events (1,461 records): Stocking densities, species, fish weights, harvest volumes, and pond preparation practices.
Dropouts (94 ponds): Ponds that have left the program over the years, including reasons for leaving.
Some key figures:
Total water quality measurements: 11,433
Unique ponds tracked: 305
Regions covered: Eluru and Nellore, Andhra Pradesh
Timeline: June 2021 – January 2026

Protecting Farmer Privacy
We work closely with the farmers in our program, and protecting their privacy is important to us. Before publishing, we removed all personally identifying information from the data, including farmer names, GPS coordinates, and village names. Pond IDs have been replaced with anonymized identifiers generated through a one-way hashing process, so there is no way to trace the published data back to individual farmers. These anonymized IDs are consistent across all four datasets, allowing researchers to link records across files without compromising privacy.
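To illustrate the idea, here is a minimal Python sketch of how a one-way hash can produce anonymized IDs that stay consistent across files. It is not our actual processing script (that is in the repository): the keyed SHA-256 (HMAC) approach, the secret key, the 12-character truncation, and the pond IDs and field names below are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key; a real pipeline would keep its key private and unpublished.
SECRET_KEY = b"example-secret-not-the-real-one"


def anonymize_pond_id(pond_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable, non-reversible identifier for a pond ID (illustrative only)."""
    digest = hmac.new(key, pond_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]  # truncated for readability; an assumption, not the published format


# Two hypothetical tables that refer to the same original pond ID.
water_quality = [{"pond_id": "POND-001", "do_mg_l": 4.2}]
stocking = [{"pond_id": "POND-001", "species": "species_a"}]

# Hashing each table separately still yields matching IDs, so records can be joined.
wq_anon = [{**row, "pond_id": anonymize_pond_id(row["pond_id"])} for row in water_quality]
st_anon = [{**row, "pond_id": anonymize_pond_id(row["pond_id"])} for row in stocking]

# Link water quality rows to stocking rows using only the anonymized ID.
species_by_pond = {row["pond_id"]: row["species"] for row in st_anon}
linked = [{**row, "species": species_by_pond.get(row["pond_id"])} for row in wq_anon]
```

Because the same input always maps to the same output, the join works without the original pond ID ever appearing in the published rows.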
Some Disclaimers
Despite our extensive efforts in quality control and error correction, the data has some limitations:
There are potential sources of error in this data, including equipment calibration issues and human error during data collection. We have improved our methods over time, so older measurements are likely less accurate. In particular, we encourage readers to take all 2021 and 2022 measurements (those from the first two years of our program) with a large grain of salt.
We particularly want to highlight a change in our dissolved oxygen (DO) measurement method, as we found it strongly influences out-of-range rates. In March 2024 (Eluru) and November 2024 (Nellore), we replaced our former water quality meter with Winkler's Method for DO and a pH pen for pH. We consider Winkler's Method to be the gold standard for DO measurement, and since adopting it, we have observed notably fewer DO out-of-range instances.
We have also updated our ranges over the years, as we developed better methods and gained more knowledge. Specifically, we added and removed parameters from our monitoring and changed the required ranges for turbidity and ammonia. As a result, a measurement that was considered in range in 2021 may now be considered out-of-range.
The measures we use, such as water quality metrics, are proxies for fish welfare and are inherently imperfect.
You may notice many empty or "NA" values. These generally represent data that we deliberately chose not to collect for a given measurement type, not missing data. For example, we only measure turbidity in the morning and only suggest corrective actions when water quality is out of range.
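For anyone analyzing the data, the DO method change and the "NA" convention described above could be handled along these lines. This is a rough sketch only: the column names (`region`, `date`, `dissolved_oxygen`), the ISO date format, and the use of the first day of each month as the cutover are assumptions, since this post gives only the month of each change. The data dictionary in the repository is the authority on the actual columns.

```python
from datetime import date
from typing import Optional

# Assumed cutover dates: only the month is stated, so the first of the month is a guess.
WINKLER_START = {
    "Eluru": date(2024, 3, 1),
    "Nellore": date(2024, 11, 1),
}


def do_method(region: str, measured_on: date) -> str:
    """Label a DO reading by the measurement method likely in use on that date."""
    start = WINKLER_START.get(region)
    if start is None:
        return "unknown"
    return "winkler" if measured_on >= start else "meter"


def parse_value(raw: str) -> Optional[float]:
    """Treat empty or 'NA' cells as 'not collected' rather than as zero."""
    text = raw.strip()
    if text == "" or text.upper() == "NA":
        return None
    return float(text)


# Hypothetical rows; real column names and values may differ.
rows = [
    {"region": "Eluru", "date": "2024-02-15", "dissolved_oxygen": "3.1"},
    {"region": "Eluru", "date": "2024-03-20", "dissolved_oxygen": "5.4"},
    {"region": "Nellore", "date": "2024-06-01", "dissolved_oxygen": "NA"},
]

labelled = [
    {
        "method": do_method(r["region"], date.fromisoformat(r["date"])),
        "do": parse_value(r["dissolved_oxygen"]),
    }
    for r in rows
]
```

Comparing out-of-range rates separately for each method label helps avoid mistaking the method change for a change in pond conditions.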
Access the Data
The data is available on GitHub.
The repository includes a full data dictionary describing every column, as well as the processing scripts used to prepare the data. The data is released under a CC-BY-4.0 license.
We plan to continue publishing updated data periodically. If you have questions, comments, or would like to conduct any analysis, we encourage you to reach out.



