Commit

Updating the test case
rhnfzl committed Nov 13, 2024
1 parent a32ed73 commit fe7bd88
Showing 2 changed files with 21 additions and 4 deletions.
13 changes: 11 additions & 2 deletions README.md
@@ -1,6 +1,15 @@
-# `SqueakyCleanText`
+<div align="center">

-[![PyPI](https://img.shields.io/pypi/v/squeakycleantext.svg)](https://pypi.org/project/squeakycleantext/) [![PyPI - Downloads](https://img.shields.io/pypi/dm/squeakycleantext)](https://pypistats.org/packages/squeakycleantext)
+# SqueakyCleanText
+
+[![PyPI](https://img.shields.io/pypi/v/squeakycleantext.svg)](https://pypi.org/project/squeakycleantext/)
+[![PyPI - Downloads](https://img.shields.io/pypi/dm/squeakycleantext)](https://pypistats.org/packages/squeakycleantext)
+[![Python package](https://github.com/rhnfzl/SqueakyCleanText/actions/workflows/python-package.yml/badge.svg)](https://github.com/rhnfzl/SqueakyCleanText/actions/workflows/python-package.yml)
+[![Python Versions](https://img.shields.io/badge/Python-3.10%20|%203.11%20|%203.12-blue)](https://pypi.org/project/squeakycleantext/)
+[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
+
+A comprehensive text cleaning and preprocessing pipeline for machine learning and NLP tasks.
+</div>

 In the world of machine learning and natural language processing, clean and well-structured text data is crucial for building effective downstream models and managing token limits in language models.

12 changes: 10 additions & 2 deletions tests/test_sct.py
@@ -65,12 +65,13 @@ def setUpClass(cls):
         try:
             with timeout(1200): # 20 minute timeout
                 config.CHECK_NER_PROCESS = False
+                # Initialize all the processing classes
                 cls.ProcessContacts = contact.ProcessContacts()
                 cls.ProcessDateTime = datetime.ProcessDateTime()
                 cls.ProcessSpecialSymbols = special.ProcessSpecialSymbols()
                 cls.NormaliseText = normtext.NormaliseText()
                 cls.ProcessStopwords = stopwords.ProcessStopwords()
-                cls.fake = Faker()
+                cls.fake = Faker() # Initialize Faker

                 # Override default models with smaller model for testing
                 test_models = ["dslim/bert-base-NER"] * 5 # Same small model for all languages
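
The `timeout(1200)` helper wrapped around the setup above is not defined in this diff. As a rough sketch only, one common way to write such a context manager uses `signal.alarm`; the implementation below is an assumption for illustration (Unix only, main thread only) and is not taken from the repository.

```python
import signal
from contextlib import contextmanager


@contextmanager
def timeout(seconds):
    """Hypothetical stand-in: raise TimeoutError if the block runs too long."""

    def _on_alarm(signum, frame):
        raise TimeoutError(f"block exceeded {seconds} seconds")

    # Install our handler and remember the previous one so it can be restored.
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)  # schedule SIGALRM after `seconds` whole seconds
    try:
        yield
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```

With a helper like this, a slow model download inside `setUpClass` fails with a `TimeoutError` instead of hanging the test run indefinitely.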
@@ -99,8 +100,15 @@ def setUpClass(cls):
             raise

     def setUp(self):
+        """Set up test fixtures before each test method."""
         config.CHECK_NER_PROCESS = True
-        # Use the class-level NER instance instead of creating a new one
+        # Copy class-level attributes to instance level
+        self.ProcessContacts = self.__class__.ProcessContacts
+        self.ProcessDateTime = self.__class__.ProcessDateTime
+        self.ProcessSpecialSymbols = self.__class__.ProcessSpecialSymbols
+        self.NormaliseText = self.__class__.NormaliseText
+        self.ProcessStopwords = self.__class__.ProcessStopwords
+        self.fake = self.__class__.fake
         self.ner = self.__class__.ner

     @settings(deadline=None)