There are five steps in moving from data to information: 1) data collection, 2) data organization, 3) data cleaning, 4) data analysis, and 5) data visualization. If using technology tools to facilitate data collection, it is important to ground your approach by considering your data collectors’ and target audience’s needs, motivations and limitations. Human-centered design exercises in the Co/Act toolkit21 can help you do this. The monitoring tools you use and the surveys you develop will set the direction for how the monitoring initiative engages with its target audience, and will inform not only the indicators for data analysis but also the communications strategy around the findings.

Data collection is what your monitoring team will do on a regular basis, using the monitoring tool that you design. After defining the target and parameters of the data collection and designing the research tools, as outlined in the monitoring preparation chapter above, the monitoring team will use those tools and the determined data sources to record the information and enter it into a software program. This could be a simple spreadsheet solution like Google Sheets,22 or it could involve more sophisticated software, such as SPSS Statistics.23 For most monitoring initiatives, though, at least in the early stages, sophisticated programs like SPSS are not required.

Tip: Monitoring groups sometimes use national surveys, simpler local surveys, or focus group discussions to collect data. CSOs and legislatures can use that information to shape policy and encourage direct citizen engagement. However, these tools entail some risk; see the Surveys and Selection Bias in the Types of Political Process Monitoring chapter above.

When adopting a monitoring tool, you should understand the difference between qualitative information and quantitative information. With qualitative data, you track behavior and practices. With quantitative data, you track satisfaction, frequency and availability. The analysis of qualitative data is more thematic, whereas the analysis of quantitative data is more statistical.

Monitoring tools to gather qualitative information are made up of open-ended questions rather than just yes/no or ranking questions. Such tools can provide you with longer and more subjective answers. Focus groups or interview guides, which lay out planned questions with space for interviewers to record notes, are helpful to facilitate qualitative data collection. In particular, nondirective interviewing can be a helpful qualitative data-gathering technique. In nondirective interviewing, the interviewer does not frame questions in terms of right or wrong answers or limited sets of options and avoids leading the interviewee to answer in particular ways or within particular value systems. Instead, the interviewer uses an open-ended approach to explore the interviewee’s thoughts, attitudes and beliefs. Monitoring tools to gather quantitative data are made up of closed-ended questions, typically those with answers of yes, no or maybe, as well as rankings.

You should also understand the difference between primary data sources and secondary data sources. Primary data includes direct observations, whereas secondary data includes previously conducted research studies. You must gather primary data yourself with questionnaires and interviews, whereas secondary data can be gathered from sources such as reports produced by the government, international organizations or other non-governmental organizations; official statistics; or media articles.

Your approach to data collection will depend on the type of data you collect and how it is collected. Only once you’ve determined this will it make sense to look at which technologies can be used to facilitate data entry, management and analysis. Some tools are more useful for collecting structured data but would not be helpful with the free-form data that you might collect through qualitative interviews. Other tools, such as data mining tools, are more helpful for getting information from secondary sources. In some cases, digital tools might not be needed at all.

Data collection tools can be divided into three categories: data repositories, online forms and data mining tools. First, data repositories like Airtable24 or Google Sheets25 — basically, spreadsheets — are used to organize structured data. Information can be entered directly into the data repositories in a tabular format. However, this is not very user-friendly and can lead to messy datasets, especially if more than one person is working on the database or there is no standardized format.
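
If your team is comfortable with a little scripting, it can help to check exported spreadsheet data against one agreed format before analysis. The sketch below is a minimal illustration in Python using the pandas library; the file name and column names (observer_id, date, region, observation) are hypothetical placeholders, not fields from any particular tool.

```python
# A minimal sketch: load a spreadsheet export and enforce one standard format
# so that rows from several data collectors remain comparable.
# All file and column names here are hypothetical examples.
import pandas as pd

EXPECTED_COLUMNS = ["observer_id", "date", "region", "observation"]

def load_repository(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)

    # Fail early if collectors added, renamed, or dropped columns.
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    # Normalize free-text fields so "Region 1" and "region 1 " match later.
    df["region"] = df["region"].str.strip().str.lower()
    df["date"] = pd.to_datetime(df["date"], errors="coerce")
    return df[EXPECTED_COLUMNS]

observations = load_repository("observations.csv")
print(observations.head())
```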

Second, online forms make it easier to input data into the repository in a structured format. More structured data facilitates analysis and visualization. Simple online forms, like Survey Monkey26 or Google Forms,27 can be used to limit human error by validating entries, for example by only allowing valid email addresses to be entered into an email field. Advanced survey tools that take advantage of modern technologies, like smartphones, can even collect data passively to validate entries or simply gather additional data for analysis. For example, smartphones with geolocation services can record the location of the person collecting the data and the time they collected it, in a way that would be hard to falsify.
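
Form builders such as Google Forms handle validation for you, but a minimal sketch can show the kind of check involved. In the hypothetical Python example below, an entry is only accepted if its email address looks valid and its rating is one of the allowed values; the field names and rules are invented for illustration.

```python
# A minimal sketch of the kind of entry validation an online form performs
# before a row reaches the repository. Field names and rules are hypothetical.
import re

EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry is accepted."""
    problems = []
    if not EMAIL_PATTERN.match(entry.get("email", "")):
        problems.append("email address is not valid")
    if entry.get("rating") not in {"1", "2", "3", "4", "5"}:
        problems.append("rating must be a value from 1 to 5")
    return problems

print(validate_entry({"email": "observer@example.org", "rating": "4"}))  # []
print(validate_entry({"email": "not-an-address", "rating": "9"}))        # two problems
```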

Sometimes it is necessary for political process monitors to collect information directly from their research subjects, such as with opinion surveys, crowdsourcing or through the work of trained field-based observers. Crowdsourcing or crowdmapping tools like Fix My Community28 can help improve government service delivery through citizen reporting. Monitoring efforts that involve field-based observers can take advantage of structured data collection tools like Apollo29 to collect observation reports in real time. In both cases, it is important to consider how the medium might impact the collected data. For example, an online survey of a community will be biased toward individuals with internet access (see Surveys and Selection Bias in the Types of Political Process Monitoring section for more information). If field-based observers will be in areas without internet access, phone calls or SMS may be preferable to online forms.

Third, data mining tools take publicly available data and use it for analysis. Some platforms and websites enable direct downloads of structured data via application programming interfaces (APIs) or really simple syndication (RSS) feeds. Pulling data from APIs is in most cases legal and ethical, since the API data is deliberately regulated by platforms and structured so as not to violate users’ rights. In cases where data is only available in formats that cannot be readily downloaded and analyzed, such as information on websites or in PDFs, you might need to make use of data scraping tools. Web scraping tools like 0archive,30 for example, can be used to extract data from websites or social media platforms and convert it into structured data sets. Web scraping is not regulated by platforms and may violate their terms of service or even be illegal in some contexts. Tools like DocumentCloud31 or Adobe Acrobat32 can help extract data from PDFs or image files. Amazon Textract33 can also extract relationships or structure, which can help automate the process of organizing data extracted from PDFs into a structured data set. While these tools are helpful when dealing with massive amounts of data, for smaller monitoring projects, manual data entry might actually be more efficient.
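
As a rough illustration of pulling structured data rather than scraping it, the sketch below requests records from a hypothetical API endpoint in Python and saves the needed fields to a flat file that a data repository can import. The URL, parameters and field names are assumptions for the example; consult the actual platform's API documentation and terms of service before collecting.

```python
# A minimal sketch of pulling structured data from a public API.
# The endpoint URL, parameters and field names are hypothetical.
import csv
import requests

API_URL = "https://example.org/api/v1/sessions"  # hypothetical endpoint

response = requests.get(API_URL, params={"year": 2022}, timeout=30)
response.raise_for_status()          # stop if the request failed
records = response.json()            # assume the API returns a list of records

# Save only the fields you need into a flat file your repository can import.
fields = ["date", "topic", "attendance"]
with open("sessions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()
    for record in records:
        writer.writerow({field: record.get(field, "") for field in fields})
```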

Tip: Take precautions to ensure the security of the data collection tool and the privacy of the data, particularly when using online tools to collect data that involves sensitive or personally identifiable information. In some contexts, as with the European Union’s General Data Protection Regulation,34 data protection or localization laws must be followed when collecting and processing this personal data. Investigate the applicable legal framework and collect only the minimum data necessary for your monitoring efforts. Regardless of your monitoring plan, take some time to assess and mitigate potential physical, digital and information risks that could arise as a result of your work, and reference resources like the chapter on “Creating Your Organizational Security Plan”35 in NDI’s Cybersecurity Handbook for Civil Society Organizations to work through a risk assessment process and access security resources and tips as needed.

Most software-as-a-service platforms have Terms of Service and Privacy Policy pages that explain users’ rights and how the company handles personal information. The section on Communicating and Storing Data Securely36 in NDI’s Cybersecurity Handbook For Civil Society Organizations has additional information on this subject.

Data organization and cleaning occur together and are about categorizing and structuring your information so that you can, for example, rank or compare the data. After you have adopted your monitoring tool and collected data through observation and other means, you have to organize and clean that data to turn it into useful information. This is done by finding and highlighting patterns that are analyzed from a contextually relevant perspective. Especially when working with secondary data, multiple datasets may need to be collected from more than one source in order to get a full understanding of the political process you are monitoring. Merging these datasets involves matching and reconciling related fields and resolving duplicate references. It is highly recommended that you work with a data scientist who has experience working with the type of data you collect. How you organize and clean the data will impact the types of questions you are able to answer during the data analysis process.
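
To make the merging step more concrete, the sketch below joins an observation dataset with a secondary dataset on a shared field and removes duplicate rows, using the pandas library in Python. The file names and column names are hypothetical; real reconciliation is usually messier and benefits from a data scientist's review.

```python
# A minimal sketch of merging two datasets on a shared field and reconciling
# duplicates. All file and column names are hypothetical examples.
import pandas as pd

budget = pd.read_csv("ministry_budget.csv")        # secondary source
sessions = pd.read_csv("observed_sessions.csv")    # your own observations

# Reconcile the matching field so "Ministry of Health " and
# "ministry of health" refer to the same institution.
for df in (budget, sessions):
    df["institution"] = df["institution"].str.strip().str.lower()

merged = sessions.merge(budget, on="institution", how="left")

# Drop exact duplicate observation rows before analysis.
merged = merged.drop_duplicates(subset=["institution", "session_date"])
merged.to_csv("merged_dataset.csv", index=False)
```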

It may take hundreds or thousands of data items gathered through monitoring tools and access to existing data to generate information that is useful as evidence of whether a political process is working as expected. Data may come from institutions, actions and behaviors, as well as from citizens’ perceptions. You may have both quantitative and qualitative data. That data must be organized and cleaned to generate information that can be useful for your purposes.

While quantitative data is inherently structured and lends itself to analysis, qualitative data is best organized through coding: going through the data, creating themes, and then analyzing the data by theme, without preconceived biases.
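
Once a reviewer has assigned theme codes to interview excerpts, even a very small script can summarize how often each theme appears. The sketch below is a hypothetical illustration in Python; the excerpts and theme labels are invented, and the coding itself remains a human judgment.

```python
# A minimal sketch of tallying themes after qualitative coding: each excerpt
# has already been tagged with one or more theme codes by a reviewer.
# The excerpts and theme labels are hypothetical examples.
from collections import Counter

coded_excerpts = [
    {"excerpt": "The hearing schedule was never published.", "themes": ["transparency"]},
    {"excerpt": "We were invited but given two days' notice.", "themes": ["participation", "timeliness"]},
    {"excerpt": "Minutes appeared online a month late.", "themes": ["transparency", "timeliness"]},
]

theme_counts = Counter(theme for item in coded_excerpts for theme in item["themes"])
for theme, count in theme_counts.most_common():
    print(f"{theme}: {count}")
```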

Data analysis is not only about aggregating information and producing statistics. Analysis is most helpful when you interpret what the information means and draw conclusions that can lead to actionable, evidence-driven recommendations. For example, does the analysis show whether the process is transparent? Does it indicate an institution’s lack of will or capacity? Is the public voice included in the political process, and have institutions created political space for citizen participation? Is there accountability to citizens and other institutions?

Data visualization is important in political process monitoring, just as it is in research, because it makes information easier to understand and can present a lot of information in a visual, concise way. Moreover, good visualization can be an effective storytelling tool to communicate what your data means in a compelling way. Charts, infographics, dashboards with analytics, and video clips with illustrations are all types of data visualization techniques. However, poorly done data visualizations can actually do more harm than good, presenting data in a misleading way. For more complex datasets, graphic designers and data scientists may be needed to represent the data both accurately and intelligibly. For simpler visualizations, available tools make it possible for individuals without specific expertise to create graphics. Tableau,37 Carto38 and Mapbox39 are particularly suited for datasets with location data. Infographics creation tools like Infogram,40 Canva41 and Piktochart42 make creating simple visualizations straightforward. Google Sheets43 and Airtable44 (using extensions) even have some basic visualization features for creating charts and graphs from within the data repository tool.
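
For teams that prefer scripting to point-and-click tools, a simple chart can also be produced directly from an organized dataset. The sketch below uses the matplotlib library in Python with hypothetical labels and counts; the output image can then be dropped into a report or infographic.

```python
# A minimal sketch of a simple chart built directly from an organized dataset.
# The category labels and counts are hypothetical examples.
import matplotlib.pyplot as plt

regions = ["North", "Central", "South"]
sessions_open_to_public = [12, 7, 3]

plt.bar(regions, sessions_open_to_public)
plt.title("Council sessions open to the public, by region")
plt.ylabel("Number of sessions")
plt.tight_layout()
plt.savefig("open_sessions_by_region.png")  # image file ready for a report
```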

Recommendations

  • Data collection can take longer than expected, so allow plenty of time to set up good systems and processes for data collection, entry, storage, management, analysis and reporting.
  • Consider outlining any assumptions about the target audience, their needs and motivations, and how they will engage with your monitoring findings. From there, build research questions to evaluate these assumptions and ground the monitoring initiative design and communications plan in an approach that will have the most impact on the target audience.
  • Consider commissioning an independent review of your methodology to check whether your tool can provide quality data and whether the information you are gathering is essential for your goal.
  • Provide early feedback to your target audience on the initial findings and allow an opportunity to discuss these results. You can always tweak and revise your monitoring process as you go if it makes your initiative stronger.
  • Monitoring is often a long-term process, and the people involved can change over time, especially in the political sphere. The process should, therefore, be documented and filed in a location accessible to the appropriate people in your organization.

Footnotes

21 Matt Bailey, Priyal Bhatt and Caroline Sinders, “Co/Act: Human-Centered Design for Activists.”

22 “Google Sheets,” DemTools, updated July 28, 2021, https://dem.tools/guides-and-tools/google-sheets.

23 “IBM SPSS Statistics,” IBM, accessed October 29, 2022, https://www.ibm.com/products/spss-statistics.

24 “Airtable,” DemTools, updated July 14, 2022, https://dem.tools/guides-and-tools/airtable.

25 “Google Sheets.”

26 “Survey Monkey.”

27 “Google Forms.”

28 “Fix My Community.”

29 “Apollo,” DemTools, updated December 9, 2021, https://dem.tools/guides-and-tools/apollo.

30 “0archive,” DemTools, updated September 22, 2021, https://dem.tools/guides-and-tools/0archive.

31 “Document Cloud,” DemTools, updated August 30, 2021, https://dem.tools/guides-and-tools/document-cloud.

32 “Adobe Acrobat,” Adobe, accessed October 27, 2022, https://www.adobe.com/acrobat.html.

33 “Amazon Textract,” Amazon Web Services, accessed October 27, 2022, https://aws.amazon.com/textract/.

34 “General Data Protection Regulation (GDPR),” National Democratic Institute, updated July 17, 2018, https://www.ndi.org/gdpr.

35 “Creating Your Organizational Security Plan,” in Cybersecurity Handbook for Civil Society Organizations, National Democratic Institute, accessed October 5, 2022, https://cso.cyberhandbook.org/topics.

36 “Communicating and Storing Data Securely,” in Cybersecurity Handbook For Civil Society Organizations, National Democratic Institute, last updated July 2022, https://cso.cyberhandbook.org/topics/storing-data/communications-and-sharing-data.

37 “Tableau,” DemTools, updated July 28, 2021, https://dem.tools/guides-and-tools/tableau.

38 “Carto,” DemTools, updated July 28, 2021, https://dem.tools/guides-and-tools/carto.

39 “Mapbox,” DemTools, updated October 28, 2021, https://dem.tools/guides-and-tools/mapbox.

40 “Infogram,” DemTools, updated July 28, 2021, https://dem.tools/guides-and-tools/infogram.

41 “Canva,” DemTools, updated July 28, 2021, https://dem.tools/guides-and-tools/canva.

42 “Piktochart.”

43 “Google Sheets.”

44 “Airtable.”