Is Gibberish

This package finds out if a string is real language or just nonsense. it returns a boolean, and you can tweak its sensitivity or exclude certain words.

What is it good for?

It might be good for:

Checking which files have human readable names.
Checking if a string is good to be used as a title for an entry.
Comparing two dodgy strings and seeing which is better.

Examples

Simple example:

const isGibberish = require('is-gibberish')

console.log(isGibberish('Hello there!'))
// false

console.log(isGibberish('asfafasfasfafg'))
// true

Here's a more advanced example that finds filenames that make no sense:

const isGibberish = require('is-gibberish')
// Get your data, this can be anything, in this case it happens to be a large
//array filenames.
const filenames = require('./filenames')

const OPTIONS = {
  // These are totally English, silly robot.
  exclude: ['hmmmmmmmmmm', 'javascript-02', 'lolnoob'],
  // Play with this value to see changes in the returned array.
  sensitivity: 0.111
}

const badFilenames = filenames
  .map(filename => (isGibberish(filename, OPTIONS) ? filename : undefined))
  .filter(Boolean)

console.log(JSON.stringify(badFilenames, null, 2))
console.log(
  `There is a total of ${filenames.length} filenames
of which ${badFilenames.length} is gibberish`
)

You can find the above example from the examples dir of this repo. If you want to play with it, just clone this repo and run npm run example.

By default it returns a boolean, but it can be configured to return the score, also, if some more advanced comparing is needed:

const isGibberish = require('is-gibberish')

const OPTIONS = { returnScore: true }
const string1 = isGibberish('lolnoob', OPTIONS)
const string2 = isGibberish('lolwut', OPTIONS)

console.log('`string1` makes this much sense:', string1)
console.log('and `string2` this much:', string2)

API

isGibberish(input[, options])

Returns: boolean|number

input

Type: string

options.exclude

Type: array

Exclude from analysis and always return false. These are the exceptions that you know are not real words, but you’d still like to classify as real words.

options.sensitivity

Type: number
Default: 0.111

How sensitive the match is. Lower the number, the more forgiving it is. Through quasi experimental research done in my research laboratory, I'd say 0.111 is good value to start with.

How does it work?

This is just a thin convenience wrapper around languagedetect package.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
examples		examples
.gitignore		.gitignore
README.md		README.md
index.js		index.js
jest.config.js		jest.config.js
package-lock.json		package-lock.json
package.json		package.json
spec.js		spec.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Is Gibberish

What is it good for?

Examples

API

isGibberish(input[, options])

How does it work?

About

Releases

Packages

Languages

hilja/is-gibberish

Folders and files

Latest commit

History

Repository files navigation

Is Gibberish

What is it good for?

Examples

API

isGibberish(input[, options])

How does it work?

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages