Measuring Improvement in Access to Complete Data in Healthcare Collaborative Database Systems
1. Nurul Emran, Fathin N. M. Leza and Noraswaliza Abdullah
Computational Intelligence Technologies (CIT),
Universiti Teknikal Malaysia Melaka, Malaysia
2. “Stay Valued, Visible and Vibrant”
RESEARCH HIGHLIGHTS
Accessing complete data is crucial, especially in the
healthcare domain. In a context with multiple data
providers, however, accessing complete data is a
challenge: not only must data be collected and
integrated, but the data providers must also
collaborate.
• This paper presents the results of evaluating the Collaborative
Integrated Database System (COLLIDS) in terms of the degree
of improvement in access to complete data that it offers data
providers.
• Statistical evaluation using the Wilcoxon signed-rank test is used to
verify that the improvement in access to complete data sets is
significant.
• Our experimental data were collected from 106 clinics (covering the
years 2012 to 2014) visited by a university's staff and
their dependents (staff family members). The sample data
consist of patient treatment information (i.e. staffID, staff
name, patient name, diagnosis received, diseases, and drugs).
• Population-based completeness (PBC) is adopted as the
measure.
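As a rough illustration of the measure, the sketch below computes PBC as the fraction of a reference population that a data provider holds. The clinic names and patient identifiers are hypothetical, and taking the union of all participating clinics as the reference population is an assumption made for this example, not something the poster states.

```python
def pbc(dataset, reference_population):
    """Population-based completeness: the share of the reference
    population that is present in the dataset under measure."""
    reference = set(reference_population)
    if not reference:
        raise ValueError("reference population must not be empty")
    return len(set(dataset) & reference) / len(reference)


# Hypothetical patient identifiers held by three clinics.
clinics = {
    "clinic_a": {"p1", "p2", "p3"},
    "clinic_b": {"p3", "p4"},
    "clinic_c": {"p5"},
}

# Assumption: the reference population is every patient known to any clinic.
reference = set().union(*clinics.values())

# Completeness of each clinic when it can only see its own records.
as_is = {name: pbc(patients, reference) for name, patients in clinics.items()}

for name, value in sorted(as_is.items()):
    print(f"{name}: {value:.0%}")
```

Under this assumption, a clinic that gains access to the integrated data set would reach a PBC of 1.0, so its increment would be `1.0 - as_is[name]`.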
• In this research, we argue that COLLIDS will improve access to complete data.
We hypothesize that there will be an increment in the ratio of completeness of
data accessible by data providers (who participate in the collaborative system).
• Therefore, we evaluate COLLIDS in terms of the increment in access to complete
data for all data providers. Even though a data integration solution will usually
improve data completeness, studies on whether this is always the case
are limited (refer to the data completeness cases, in some of which data
integration is of no benefit).
• This research aims to show how significant the ratio is
(especially where an improvement in data
completeness is present).
We analyze data completeness cases in order to
determine the type of data providers under measure.
The three types of relationship for the population of interest
are: 1) superset-subset relationship, 2) subset-overlap
relationship, and 3) disjoint-overlap relationship.
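The poster names the three relationship types without defining them in full, so the following is only a sketch of the basic pairwise set relations (superset, subset, overlap, disjoint) from which such cases could be built. The function name and the sample patient sets are hypothetical.

```python
def relationship(a, b):
    """Classify how provider A's population relates to provider B's."""
    a, b = set(a), set(b)
    if a == b:
        return "equal"
    if a > b:          # A holds everything B holds, and more
        return "superset"
    if a < b:          # everything A holds is also held by B
        return "subset"
    if a & b:          # some shared patients, but neither contains the other
        return "overlap"
    return "disjoint"  # no shared patients


# Hypothetical patient sets for illustration only.
print(relationship({"p1", "p2", "p3"}, {"p1", "p2"}))
print(relationship({"p1", "p2"}, {"p2", "p3"}))
print(relationship({"p1"}, {"p4"}))
```

A provider that is a superset of every other provider gains nothing from integration, which is why identifying the relationship matters before measuring improvement.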
RESULTS AND CONCLUSIONS
Conclusions:
1. COLLIDS is of benefit to most data providers, as an increment of more than 50% in data completeness can be observed in the results.
2. COLLIDS has been shown to be useful, as the improvement in access to complete data is significant. The results indicate that none of
the data providers (clinics) falls under Case 1 (a data provider as the superset of other data providers). The clinics under measure
are more likely to be data providers with the characteristics defined in Case 2 and/or Case 3. As none of the clinics has complete
coverage of its patient data set, the collaborative integration advocated by COLLIDS will benefit all participants (data
providers).
Normality Test for the Test Groups
Null Hypothesis (H0): The distributions
of data for both groups under measure
are normal.
Alternative Hypothesis (H1): The
distributions of data for both groups
under measure are not normal.
Wilcoxon Test
Null Hypothesis (H0): COLLIDS has no effect on
access to complete data.
Alternative Hypothesis (H1): COLLIDS improves access to
complete data.
Group 1: the set of completeness values for all
healthcare providers, denoted as As-Is
Completeness
(before COLLIDS is implemented, when
access to patient records in other
participating clinics is restricted).
The results:
Refer to Figure 2 and Figure 3.
• The distributions are significantly non-normal
(p < 0.05) for both groups.
• The normality tests for both groups are
significant, thus the null hypothesis (H0)
is rejected.
• The results support H1, indicating that the
distributions of data in both groups
are not normal. This suggests that a
non-parametric test (such as the
Wilcoxon test) must be used to
quantify the difference between
the two groups of data (that is, whether the
increment in access to complete data is
significant after COLLIDS is
implemented).
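The poster does not say which normality test was used. As a stand-in, the sketch below uses the Jarque-Bera test, chosen only because it can be written with the Python standard library. It measures how far the sample skewness and kurtosis are from those of a normal distribution. The sample values are hypothetical, and the p-value relies on a large-sample chi-square approximation, so it should be read with caution for small samples.

```python
import math


def jarque_bera(values):
    """Return (JB statistic, approximate p-value) for H0: data are normal."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("all observations are identical")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    jb = n / 6 * (skewness ** 2 + excess_kurtosis ** 2 / 4)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-jb / 2)
    return jb, p_value


# Hypothetical, strongly right-skewed completeness values (in percent).
sample = [1.0] * 50 + [100.0] * 2
jb, p = jarque_bera(sample)
print("reject normality at 0.05:", p < 0.05)
```

When normality is rejected for the groups being compared, a rank-based test such as the Wilcoxon signed-rank test is the usual fallback for paired data.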
The results:
In Figure 4:
1. In the positive ranks row, the number of observations (N) is 106,
which represents the cases where the Completeness Increment after applying
COLLIDS is greater than the As-Is Completeness. The result in the
positive ranks row indicates that all clinics have a positive
increment in access to complete data.
2. In the negative ranks row, the number of observations (N) is zero,
which represents the cases where the Completeness Increment after applying
COLLIDS is less than the As-Is Completeness. This means that
no clinic has a completeness increment less
than its As-Is Completeness.
In Figure 5:
The Wilcoxon signed-rank test reveals a significant
difference between the two groups of data (reported p = 0.000, i.e. p < 0.001) at the
0.01 significance level. Thus, we reject the null hypothesis (H0) and
accept the alternative hypothesis (H1).
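To make the positive ranks, negative ranks, and p-value concrete, here is a minimal sketch of a paired Wilcoxon signed-rank test using a normal approximation with a tie correction. It is not the software used to produce Figures 4 and 5. The function name and the sample values are hypothetical, and the one-sided p-value matches the directional H1 stated on the poster.

```python
import math


def wilcoxon_signed_rank(group1, group2):
    """One-sided paired test of H1: group2 values tend to exceed group1."""
    # Zero differences carry no sign information and are dropped.
    diffs = [b - a for a, b in zip(group1, group2) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation to the distribution of W+ under H0.
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)

    return {
        "n_positive": sum(1 for d in diffs if d > 0),
        "n_negative": sum(1 for d in diffs if d < 0),
        "w_plus": w_plus,
        "w_minus": w_minus,
        "z": z,
        "p_value": p_value,
    }


# Hypothetical paired values for 20 clinics (in percent).
as_is_completeness = list(range(20))
completeness_increment = [v + i + 1 for i, v in enumerate(as_is_completeness)]
result = wilcoxon_signed_rank(as_is_completeness, completeness_increment)
print(result["n_positive"], result["n_negative"], result["p_value"] < 0.01)
```

The normal approximation is a reasonable choice for a sample of 106 clinics. For much smaller samples an exact test would be preferable.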
Group 2: the set of increment values of
completeness
for all 106 clinics, denoted as
Completeness Increment (the amount of
improvement, in percentage, in
accessing complete patient records
after COLLIDS is implemented).