Introduction
In the dynamic landscape of the digital era, the emergence of advanced data collection and processing technologies has necessitated the development of comprehensive legal frameworks to protect individual privacy rights. In Europe, enacting the General Data Protection Regulation (GDPR) in 2018 marked a significant milestone, establishing an EU standard for data protection and ethical compliance (ICO 2018). The European Union enforced the GDPR by codifying a mandatory data protection baseline (Comite 2024). General Data Protection Regulation set the benchmark for responsible data management and catalyzed a worldwide re-evaluation of data protection norms (Tikkinen-Piri, Rohunen & Markkula 2018). In this context, the DataLawCompanion (DLC) tool emerges as a vital resource for disseminating data protection laws, aiming to balance innovation promotion with ethical compliance of data protection laws and to foster a paradigm shift in the management and perception of personal data.
The following East African countries – Kenya, Uganda, Rwanda, and Tanzania – have established data protection laws, while Burundi still needs a comprehensive legal framework (Ilori 2020). In the context of East Africa, particularly in countries like Kenya, Uganda, and Rwanda, the transformative impact of digital technologies has been recognised, leading to the creation of data protection laws tailored to the regional dynamics. The Data Protection Act in Kenya is important because it lays out directions and best practice regulations for companies and the government to follow on how to use personal data. If you fail to observe the timeline, according to the regulations, you face stiff penalties, i.e. 1% of your annual turnover, or five million kenya shilling, whichever is lower, as well as criminal sanctions. In addition, individual also have the right to try to find compensation from you for breach or loss of their personal data.1 In Uganda, the main purpose of DPPA is to protect the privacy of individuals and of personal data by regulating the collection and processing of personal information. The act also tries to provide for the rights of persons whose data is collected. Additionally, the Act applies to collection, possession, and utilizing personal data within Uganda and data collected outside Uganda relating to Ugandan citizens. Breach or violation of the Act and Regulations thereunder can lead to significant costs and risks for those involved. The possible consequences include damage to the reputation of the person, institution, or public body, and fines of up to two percent of the corporation’s annual gross turnover.2 The law is anticipated to help increase individuals’ confidence in Rwanda (Mutimukwe, Kolkowska & Gronlund 2019). When people are confident that their data is handled responsibly, they are more likely to engage with online services and share their information, this will drive economic growth and innovation in the country. Additionally, strict data privacy laws can facilitate international trade and data sharing. This is because countries with robust data protection laws are often deemed safe for cross-border data transfers, a requirement in today’s globalized economy. Failure to comply with the law may result in administrative fines on data controllers, data processors, and third parties.3 In this environment, the DataLawCompanion (DLC) tool emerges as a vital resource for disseminating data protection laws. The DLC seeks to balance the promotion of innovation with the ethical compliance of data protection laws. It aspires to foster a paradigm shift in the management and perception of personal data.
This paper introduces the DLC as a friendly, mobile, and online tool for navigating the complexities of data protection and ethics in our increasingly data-centric world. The tool is designed not only as a practical utility but as a companion in the journey towards more human-centred data practices. By enabling organisations to view compliance not just as a regulatory requirement but as a step towards greater maturity in data ethics, the DLC addresses the divide in attitudes towards data protection and legal compliance. The tool is reliable and ensures compliance with industry regulations (Gujar & Sing 2024). The paper also serves the dual purpose of outlining an existing gap and detailing a tailored solution to this identified necessity. It delves into the results derived from the analysis of questionnaires and the DataLawCompanion (DLC) tool, which is focused on fostering widespread awareness and ensuring adherence to Data Protection Laws. Central to this initiative is equipping businesses and individuals with the necessary knowledge to manage data legally and competently. It also presents the questionnaire analysis and results of the DLC tool that aim to create widespread awareness and ensure compliance with data protection laws (Qamar, Javed & Beg 2021). At its core, the initiative aims to empower businesses and individuals with the knowledge required to navigate the intricacies of data management responsibly.
The structure of this paper is as follows: Section 2 establishes the objectives, followed by the methodology in Section 3. Section 4 elucidates the results, including technological and business case descriptions. The paper concludes with a summary and recommendations in Sections 5 and 6, respectively.
Objectives
The primary goal of this paper is to demonstrate how the DLC tool empowers organisations, businesses, and individuals to comply effectively with data protection laws. We focus on transcending mere legal compliance by integrating ethical data management practices. Our study evaluates the DLC’s impact on enhancing understanding of Data Protection Laws among various stakeholders, aiming to establish a well-informed public adapted to navigating data management challenges responsibly. The approach adopted by the DLC tool extends beyond mere punitive measures; it aims to significantly enhance awareness and understanding of Data Protection Laws among various stakeholders.
This objective is not just about bridging the knowledge gap but about fostering a comprehensive grasp of data law protection, which is crucial in today’s data-driven environment. Moreover, the study seeks to assess the level of awareness regarding data protection laws among individuals and organisations and analyse how this awareness impacts their compliance behaviours and ethical standards in data handling. This objective is pivotal in understanding the broader implications of DLC’s integration into data management practices and its role in shaping more informed and ethically aware participants in the data landscape.
Methodologies
This study utilised a structured approach to gather data, employing a mixed-method approach. Using LLM means that the primary dataset consisted of the data protection acts of Kenya, Uganda, and Rwanda, forming the foundational material for the DLC tool, which was also used. Additional information was sourced through extensive online research to complement the dataset with relevant materials. The methodology was carefully designed to ensure the questions were as neutral and objective as possible, minimising potential response bias. The questionnaires were crafted to encompass all relevant areas of interest, aligning closely with the study’s objectives. A diverse sample of participants from various organisations and educational sectors was targeted. A snowball sampling technique was used, leveraging professional contacts within the research team and broader research group. This method ensured a wide-ranging and representative respondent pool. The Snowball method works where new study themes are used by existing study themes to form part of the sample. It is used where no sampling size can be established.
The questionnaires were administered online, ensuring confidentiality and respecting the participants’ privacy. The questionnaire responses were analysed in two distinct stages to accommodate the dual nature of the data collected. Quantitative analysis was applied to the closed-ended questions, providing statistical insights. In contrast, qualitative analysis was utilised for the open-ended questions, offering a deeper understanding of the respondents’ perspectives and experiences. Using a trained dataset, the researchers conducted subsequent studies to verify the results using LLM and ChatGPT-4. The training process included fine-tuning and iterative adjustments specifically tailored to align the model with the data protection acts of Kenya, Uganda, and Rwanda.
The performance of a Large Language Model (LLM) can be comprehensively evaluated if pre-training or fine-tuning is done. This would entail having a huge dataset or a domain-specific dataset. With a limited or no dataset, few-shot, zero-shot, or one-shot4 learning methods are convenient. There are several ways to train our models, like AWS (Amazon Web Service) Greengrass and AWS Lambda, which play an essential role in model training (Jithish, Mahalingam & Seng 2024). Due to the unavailability of a labelled summarisation and questions and answers dataset, the zero-shot learning approach is suitable. Open AI’s LLM, GPT 3.5 Turbo, is better suited to handle a zero-shot setting than pre-trained models.
In the case of summarisation, the study outlined summarisation on the legal decision dataset, curated by the Canadian Legal Information Institute, by prompt-based models such, as GPT 3.5 and GPT 4, resulting in better-quality summaries than pre-trained models. This is attributed to the fact that LLMs have an extensive architecture and are trained on a large corpus. The ROUGE (a Recall-oriented Understudy for Gisting Evaluation) score metric5 examines the lexical overlap between reference and generated summaries. A ROUGE assesses the output based on the summaries crafted by human experts, either reference summary or ground truth. It is designed to measure the similarity by comparing the system-generated summaries. GPT 3.5 Turbo outperforms the pre-trained models in the zero-shot setting on the legal dataset. It achieves 40.88 Rouge-1, 18.90 Rouge-2, and 37.63 Rouge-L scores (Xu & Ashley 2023).
Based on the study by Roegiest et al. (2023), the performance of LLM models in the question-and-answer task, with the different combinations of prompts and options on the partially structured response question, a ROUGE-1 of 0.397 and 0.422 was obtained respectively for GPT-3.5-Turbo and GPT-4. As a result of the findings, the researchers choose to use GPT 3.5 Turbo for summarisation and question-answering of the Data Protection Act documents of Kenya, Uganda, and Rwanda. Summarisation used included conducting quality and security checks, retiring and replacing documentation, and training employees (Determann 2024).
This mixed-method approach allowed for a comprehensive examination of the data, facilitating a balanced and thorough exploration of the subject matter. The methodology’s design and execution were integral in achieving the study’s aim of assessing the organisations’ awareness and compliance with Data Protection Laws.
The use of Large Language Models
The DLC uses a large language model (LLM), specifically GPT-4 Turbo for the question-answering task and GPT-3.5 Turbo for the summarisation task. ChatGPT-4, the specific LLM used in DLC, is notable for its advanced language understanding and generation capabilities. It can interpret a wide range of questions, from basic inquiries to more complex, nuanced ones, and provide detailed, informative answers. This feature is vital in data protection, where clarity and accuracy are paramount. The LLM’s effectiveness is illustrated in Figure 1.0, demonstrating its ability to dissect and respond to intricate legal questions with high precision. The LLM component of DLC, powered by ChatGPT-4, plays a pivotal role in the tool’s objective to democratise knowledge about Data Protection Laws. It acts as a dynamic, responsive resource that enhances users’ understanding and helps them navigate the often-complex landscape of data management and compliance (Sorrell 2024).

Figure 1.0
The design of the question-answer model.
We chose GPT-4 and GPT-3.5-Turbo as our Generative AI base models because they are state-of-the-art large language models that can produce high-quality and diverse text outputs. GPT-4 and GPT-3.5-Turbo have been trained on massive amounts of data, GPT-4 used 1.76 trillion parameters and GPT-3.5-Turbo used 175 billion parameters, which enables them to capture complex patterns and relationships among words, sentences, and topics. They can also adapt to different domains and tasks using their attention mechanism6 and large-scale pre-trained parameters. Using these models, we can leverage their powerful natural language generation capabilities to create novel and engaging content for various purposes and audiences. This model is adept at understanding and processing complex queries related to Data Protection Laws, offering precise, contextually relevant information. Unlike standard search engines, the LLM in DLC is tailored to handle the nuances and specifics of legal terminology and concepts, ensuring users receive accurate and comprehensive responses to their inquiries.
The DLC introduces an innovative legal summarisation tool integrated with the Large Language Model (LLM). This feature distorts intricate legal texts, encompassing various acts and legislative chapters, into clear and succinct summaries. Doing so significantly demystifies the often dense and complex legal jargon, rendering the legal landscape far more accessible and understandable to a broader audience. The legal summarisation tool leverages the capabilities of the LLM to parse and interpret extensive legal documents. It then effectively condenses this information into key points and summaries that retain the essence and critical details of the original texts. This process saves time and ensures that users without a legal background can grasp the fundamental aspects of legal documents without being overwhelmed by their complexity.
As illustrated in Figure 2.0, the legal summarisation module’s ability to provide concise, yet comprehensive, summaries is a testament to its utility in bridging the gap between legal experts and laypersons. By simplifying legal documents into more digestible formats, the tool enhances users’ understanding and engagement with legal content, fostering a more informed and legally aware society. This makes the Data Protection Laws more accessible and navigable for everyone, regardless of their legal expertise.

Figure 2.0
Legal summarisation component of the tool.
Interactive Chatbot: A user-engaging chatbot is deployed to interactively guide users through queries and concerns related to Data Protection Laws. The chatbot is a virtual assistant that offers real-time assistance and fosters a more personalised learning experience as shown in Figure 3.0.

Figure 3.0
a) Chatbot’s welcome message on the website; b) Chatbot’s home screen.
All these tools are seamlessly integrated to provide a centralised hub for individuals and businesses to access vital information on Data Protection Laws. This design prioritises simplicity and navigability, ensuring a smooth user experience.
Beyond the digital realm, the paper aims to adopt a multi-faceted approach to awareness creation. It employs engaging blogs, interactive workshops, and strategic collaborations with stakeholders, such as data protection regulators and universities. By leveraging these diverse channels, the project seeks to create a holistic ecosystem where legal compliance and informed decision-making become second nature, driving a culture of responsibility and ethical data management in the digital sphere.
Result and Discussion
The study attracted respondents from various backgrounds working in different sectors, especially education and industries as shown in the Figure 4.0. There are 15 surveys sent out and 13, 12, and 10 responded in Kenya, Uganda and Rwanda respectively. The respondents were selected based on their knowledge on the data protection acts on their respective countries The snowball sampling technique was used. This was important as it permitted a more balanced viewpoint of the attempts at data protection compliance and the issues faced by the users.

Figure 4.0
Occupation of respondent.
The respondents were then asked to determine the department in which they worked. This was to ascertain that respondents were from different fields. A larger number of respondents were in the Department of Technology compared to Finance and Administration, as shown in Figure 5.0.

Figure 5.0
Department/Field working.
Furthermore, the question asked about the respondents’ years of experience in their particular field. The majority of the respondents had between 0 and 5 years of experience in their workplace. This was important to determine their level of awareness of the Data Protection Act, as shown in Figure 6.0.

Figure 6.0
Work experience in years.
The respondents were asked to determine how often they encounter situations involving the processing of personal data in their profession. It was essential to establish whether those working in the organizations comply with the Data Protection Act. As shown in Figure 7.0 below, the majority of the respondents stated that they ‘very frequently’ encounter this.

Figure 7.0
Situations involving processing of personal data.
An additional question was asked about the extent of awareness of data protection laws in their countries. Most respondents were only moderately aware of the laws, and the others had the same percentage. This shows that the DLC tool is good as it would educate those unaware of the law. See Figure 8.0.

Figure 8.0
Awareness of the existence of data protection laws in the country.
To what extent do you think your business/organization prioritizes data protection and privacy? The majority of the respondents believed that their organisations moderately prioritise data protection and privacy, while quite a number also thought that they did not as seen in Figure 9.0 below.

Figure 9.0
Extent of Businesses/Organizations’ priorities towards data protection and privacy.
Most respondents felt that their organizations partially comply with data protection laws, as shown in Figure 10.0, because they encounter data protection laws in their day-to-day activities.

Figure 10.0
Organisation’s level of compliance with data protection laws.
Most respondents felt that having complex regulations made the industries where they work not comply with data protection. A good number believed that lack of awareness deteriorated compliance with data protection. A similar number also felt that resource constraints made complying with data protection laws difficult. Changing the legal landscape has also become a challenge in regard to compliance with data protection. The DLC needs to be implemented. This will create an awareness to all managers and employees this has depicted in Figure 11.0.

Figure 11.0
Challenges faced by organisations regarding compliance with data protection law.
When asked about measures their organization currently has to ensure data protection compliance, each respondent had varying sentiments regarding their organization’s compliance level, as shown in Table 1.0.
Table 1.0
Measures reported to ensure data protection compliance.
| What measures does your business/organisation currently have in place to ensure data protection compliance? |
|---|
| R1 Implementation of an Information Security Management System based on ISO 27011:2013 |
| R2 Awareness creation through reviews of the data protection laws |
| R3 All data is centralized |
| R4 None |
| R5 Regulations to access data |
| R6 We have taken measures |
| R7 Planned training staff on data protection laws |
| R8 At the moment, I can’t state since I am not in the management. Management has also not addressed us directly on data protection compliance |
| R9 Occasional departmental sensitisation through memos and brochures |
| R10 Data protection policy enactment on our applications. and also adhering to the GDPR standards |
| R11 Training and capacity building |
| R12 Not aware |
| R13 More training and mainstreaming data protection in firm operations |
When respondents were asked about the challenges faced in understanding and implementing data protection laws within organisations, the majority agreed that a lack of awareness among members hindered implementation. This is why our study has created a tool to ensure individuals know the data protection laws as seen in Figure 12.0.

Figure 12.0
Challenges faced in understanding and implementing data protection laws.
Large Language Model (LLM) results
Based on the study, the performance of LLM models in the question-and-answer task, with the different combinations of prompts and options on the partially structured response question, a ROUGE-1 of 0.397 and 0.422 was obtained, respectively, for GPT-3.5-Turbo and GPT-4, as shown in the Code Extract 1.0.

Code Extract 1.0
Conversational retrieval QA chain
Conclusion and Summary
Despite the uncertainty around the enactment of the Data Protection Acts, companies have received a mixed response regarding compliance in all three countries. Some significantly larger companies have successfully met the compliance deadlines, while others, primarily small and medium-sized enterprises (SMEs), need help with these requirements. Many industry employees and educators, who form most respondents, need more awareness about data protection laws. While these respondents believe compliance is achievable, they find it takes time to be practicable, with a greater emphasis on accurate system mapping rather than on the challenges of complying with the Act. Training and education are vital to increasing awareness and understanding of these laws. One possible source of skewing comes from the interview recruitment process. There was a risk that organisations willing to discuss experiences in data protection were those with a more positive perception of their managers in promoting compliance. Company managers who felt uneasy with their level of compliance could be unwilling to discuss the matter with third parties, mainly out of fear of the resulting repercussions or reputation damage. An incentive may help alleviate this problem, but it is unlikely to overcome any strong reluctance from potential interview candidates. Another potential skew in the recruitment process comes from most interviewees being within the professional and academic circles of the authors. As a result, the average level of security and privacy maturity of the organisations contacted may have needed to be more representative of typical organisations (Help net security 2018).
The study also reflects on the practical aspects of dealing with data protection laws, noting that existing tools are relevant but often need to be understood due to limited knowledge among users. Simplification and comprehensive training could enhance understanding and compliance. This was possible by introducing DLC. The DataLawCompanion (DLC) tool emerges as a vital resource for disseminating data protection laws. The DLC seeks to balance the promotion of innovation with the ethical compliance of data protection laws. It aspires to foster a paradigm shift in the management and perception of personal data. Overall, the study emphasises the importance of understanding compliance, mainly how some companies have successfully navigated these challenges and providing a model for smaller organisations to follow in developing mature data protection practices.
Recommendation
Compliance issues and lack of awareness are essential issues to consider. We recommend that organisations embrace clear, easily understandable language when dealing with online forms to avoid ambiguity, this tool is DLC. The tool search through into the results derived from the analysis of questionnaires and the DataLawCompanion (DLC) tool, which is focused on fostering widespread awareness and ensuring adherence to Data Protection Laws. Training should be conducted for members so that they are sufficiently aware of the issues. Simple tools should also be adopted to enable different groups of people to interact and use. It is viable for data protection law regulators to allow other companies with strong data protection measures to act as data service providers. The regulators must see the need for experts to explain compliance issues to individuals in different organisations. Nevertheless, the same privilege given to experts risks isolating smaller companies with constrained resources to compete effectively.
Notes
[4] Zero-shot learning, few-shot learning, and one-shot are all techniques that allow a machine learning model to make predictions for new classes with limited labelled data.
Acknowledgements
We thank the African Population and Health Research Center (APHRC), which hosts the Implementation Network for Sharing Population Information from Research Entities (INSPIRE), for providing us with a small grant to organize activities that led to the development of this Web application. The INSPIRE Network aspires to develop a data governance framework that facilitates collaborative evidence generation for policy impact. The authors (developers) received no direct funding to prepare this Web App.
Competing Interests
The authors have no competing interests to declare.
