wikipedia

Improving the extraction of Wikipedia data

I am happy to share some recent performance results for a new parser for Wikipedia data dumps that I have developed over the past two months.

The new parser is also written in Python, as was its predecessor included in WikiXRay. However, this new parser comes with notable improvements in speed and accuracy:

On the Analysis of Contributions from Privileged Users in Virtual Open Communities

Ortega, Felipe, Daniel Izquierdo-Cortazar, Jesus M. Gonzalez-Barahona, and Gregorio Robles. "On the Analysis of Contributions from Privileged Users in Virtual Open Communities." In 42nd Hawaii International Conference on System Sciences (HICSS), 1-10. Waikoloa, Big Island, Hawaii, USA: IEEE Computer Society, 2009.