PaperMiner
What is PaperMiner?
PaperMiner is an interactive, browser-based tool that uses text mining techniques to extract metadata and visualise it interactively on Google Maps. It enables fine-grained analysis of large-scale document collections and lets scholars group concepts, which helps to identify and distinguish long-term patterns in digitised newspaper articles semi-automatically. The system is generic: it can be adapted to the requirements of any application domain and is not limited to newspaper articles. You can access it at http://paperminer.org.au/ or http://203.101.226.97/
FGCR (Fine-grained Clustering via Ranking)
What is FGCR?
FGCR is a document (text) clustering algorithm that harnesses the scalability of a search engine to produce fine-grained clustering solutions quickly. FGCR uses the concept of most relevant clusters to improve the performance of the clustering algorithm on fine-grained clustering problems. Furthermore, it uses a new cluster representation, loci, to improve computational efficiency and mitigate the curse of dimensionality in high-dimensional data. FGCR has been comprehensively tested with social media data; further details can be found in [1].
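The idea of restricting each assignment to the most relevant clusters can be illustrated with a toy sketch. This is not the FGCR implementation from [1]: the inverted index stands in for a real search engine, and the function names (`build_index`, `search`, `cluster_via_ranking`), the tf-idf scoring, and the majority-vote assignment are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
import math


def tokenize(text):
    """Lower-case whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


def build_index(docs):
    """Toy inverted index standing in for a search engine: term -> {doc_id: tf}."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query_terms, top_k):
    """Rank documents by a simple tf-idf score and return the top_k doc ids."""
    scores = Counter()
    for term in set(query_terms):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common(top_k)]


def cluster_via_ranking(docs, top_k=3):
    """Assign each document to the cluster most common among its top-ranked hits.

    Only clusters of retrieved documents are considered, so a document is
    never compared against every existing cluster.
    """
    index = build_index(docs)
    labels = {}
    next_label = 0
    for doc_id, text in enumerate(docs):
        # Ask for one extra hit because the document usually retrieves itself.
        hits = search(index, len(docs), tokenize(text), top_k + 1)
        votes = Counter(labels[h] for h in hits if h != doc_id and h in labels)
        if votes:
            labels[doc_id] = votes.most_common(1)[0][0]
        else:
            # No labelled neighbour was retrieved: open a new cluster.
            labels[doc_id] = next_label
            next_label += 1
    return [labels[i] for i in range(len(docs))]


docs = [
    "flood river rain",
    "river flood damage",
    "cricket match score",
    "cricket score win",
]
print(cluster_via_ranking(docs))
```

Because candidates come from a ranked retrieval, the cost per document depends on `top_k` rather than on the total number of clusters, which is the property that matters when the number of clusters is large.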
Features
- Fast clustering algorithm for fine-grained and high-dimensional unstructured (text) data.
- Cached query results (recommended) as well as online query processing.
- Document vector space model (VSM) generator from a search engine.
- Document-to-query generator.
- Cluster bubble visualization.
- Disparity (DS) metric evaluation (see [1]).
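Two of the features above, the sparse VSM generator and the document-to-query generator, can be sketched in a few lines. This is a hedged illustration rather than FGCR's own code: the function names `build_sparse_vsm` and `document_to_query`, the tf-idf weighting, and the choice of the highest-weighted terms as the query are assumptions. The VSM is held in CSR layout (`data`, `indices`, `indptr`), which is the triple that `scipy.sparse.csr_matrix` accepts.

```python
from collections import Counter
import math


def build_sparse_vsm(docs):
    """Return (vocab, data, indices, indptr): tf-idf weights in CSR layout."""
    tokenized = [doc.lower().split() for doc in docs]
    all_terms = sorted({t for toks in tokenized for t in toks})
    vocab = {term: col for col, term in enumerate(all_terms)}
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    n_docs = len(docs)
    data, indices, indptr = [], [], [0]
    for toks in tokenized:
        counts = sorted(Counter(toks).items(), key=lambda kv: vocab[kv[0]])
        for term, tf in counts:
            data.append(tf * math.log(1 + n_docs / df[term]))
            indices.append(vocab[term])
        indptr.append(len(data))  # end of this row's non-zero entries
    return vocab, data, indices, indptr


def document_to_query(row, vocab, data, indices, indptr, n_terms=3):
    """Build a query string from the n_terms highest-weighted terms of one row."""
    terms = {col: term for term, col in vocab.items()}
    start, end = indptr[row], indptr[row + 1]
    # Sort by descending weight; break ties by column id for determinism.
    weighted = sorted(zip(data[start:end], indices[start:end]),
                      key=lambda pair: (-pair[0], pair[1]))
    return " ".join(terms[col] for _, col in weighted[:n_terms])


docs = ["flood river rain rain", "river flood damage"]
vocab, data, indices, indptr = build_sparse_vsm(docs)
print(document_to_query(0, vocab, data, indices, indptr, n_terms=2))
```

Only non-zero weights are stored, so memory grows with the number of distinct terms per document rather than with the vocabulary size.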
Getting Started
- Installation and usage.
- Building a sparse VSM from a search engine.
- Generating (and saving) queries from the documents.
- Calculating DS metric.
- Cluster bubble visualization.
- FGCR paper.
Installation and Requirements
FGCR is tested using Python 3.5 and depends on SciPy, NumPy, and other modules listed in requirements.txt. Please refer to the documentation for further information. Download the code here.
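Before running FGCR, it can help to confirm that the interpreter and the two named dependencies are available. The snippet below is a minimal sketch: the helper names are hypothetical, only `scipy` and `numpy` are checked because the remaining modules in requirements.txt are not listed here, and treating 3.5 as a minimum version is an assumption (the text only says FGCR is tested with it).

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_at_least(minimum=(3, 5)):
    """Check the running interpreter against an assumed minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_at_least():
        print("Python 3.5 or newer is assumed; found", sys.version.split()[0])
    missing = missing_modules(["scipy", "numpy"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("SciPy and NumPy are importable.")
```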
Citing FGCR & Contact
If you use any function from FGCR, please cite the following paper in your (academic) work.
[1] Sutanto, Taufik, and Richi Nayak. “Fine-Grained Document Clustering for Social Media Analytics.” Submitted, under review, 2017.
- To report an issue with FGCR or to suggest a feature, please use the issue tracker.
- To ask a question about using FGCR, we encourage posting to StackOverflow using the “FGCR” tag.
- To interact with FGCR developers, visit the FGCR Gitter channel.
- Finally, if you need to get in touch for non-technical information, or you are running a benchmark for an academic paper and need our VSM (used in the paper), please send us an e-mail.
License
wDataMiner
What is wDataMiner?
- wDataMiner is a web-based application owned by QUT Apply Datamining Group.
- wDataMiner lets users run different algorithms on data sets.
- Users sign up and are then approved by administrators.
- Users can upload their data as text files.
- They can then run algorithms on the uploaded files.
- A Contact Us page lets users communicate with administrators.
- Click Log in if you need to log in.
[need to provide login access to application]
Social Media Scraper
coming soon…
Sentiment Analyser
coming soon…