Commit fde99bb7 authored by Maximilian Legnar

added requirements.txt

parent fc9078de
@@ -3,7 +3,7 @@
 This python project was created as part of the article "Natural Language Processing in diagnostic texts from
 nephropathology".
-The paper can be found [here](LINK).
+The paper can be found (soon) [here](LINK).
 The scripts ```database_preparation/data_preparation_pipeline.py```, ```TextClustering/clustering_pipeline.py```
 and ```TextClassification/classification_pipeline.py``` gives an idea of how this project can be used with other datasets.
@@ -18,7 +18,7 @@ Feel free to use and adapt the scripts to your own needs.
 ## Requirements
-For preprocessing, the project requires some nltk corporas:
+```database_preparation/preprocess.py``` requires some nltk corporas:
 ```
 import nltk
 nltk.download('stopwords')
...
numpy==1.21.0
gensim==4.2.0
pandas==1.4.2
matplotlib==3.5.1
tqdm==4.64.0
scikit-learn==1.1.1
hdbscan==0.8.28
nltk==3.7
seaborn==0.11.2
validclust==0.1.1
tensorflow-gpu==2.6.0
wordcloud==1.8.2.2
joblib==1.1.0
scipy==1.7.3
yake==0.4.8
openpyxl==3.0.10
googletrans==3.1.0a0
datasets==2.3.2
transformers==4.21.0.dev0
dataclasses==0.8
pyarrow==8.0.0
keras==2.6.0
torch==1.11.0
hanta==0.2.0
\ No newline at end of file
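
Because every entry in the added requirements.txt is pinned with `==`, a local environment can be compared against the pins before running the pipelines. The following is a minimal sketch, not part of this commit: the helper names `parse_pins` and `check_pins` are hypothetical, and it assumes Python 3.8 or newer (for `importlib.metadata`) and that every line uses the plain `name==version` form seen above.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(text):
    """Return {package: pinned_version} for every 'name==version' line."""
    pins = {}
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if "==" not in line:
            continue  # blank lines, comments, or unpinned entries
        name, pinned = line.split("==", 1)
        pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins, get_version=version):
    """Return {package: (pinned, installed_or_None)} for every mismatch.

    `get_version` defaults to importlib.metadata.version; it is a
    parameter only so the check can be exercised without installing
    anything (hypothetical design choice for this sketch).
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Two pins copied from the file above, parsed from an inline string.
sample_pins = parse_pins("numpy==1.21.0\nnltk==3.7\n")
```

To check a real checkout, read the text of `requirements.txt` and pass the result of `parse_pins` to `check_pins`; an empty dictionary means every pinned package is installed at the pinned version.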