Welcome to PyStrExt’s documentation!
Quickstart
To install:
pip install pystrext
Usage:
>>> import pystrext as strext
>>> h,m,s = strext.extracts('elapse time : 5h 7m 36s', r'(\d+)h (\d+)m (\d+)s', ['0']*3)
>>> h,m,s
('5', '7', '36')
...
Functions index
MB_GB(str[, MB, GB]) | Transforms a string representing a memory MegaByte value into GigaBytes if value > 1024 |
MHz_GHz(str) | Transforms a string representing a frequency from MHz to GHz if value > 1024 |
base62_decode(str) | Decode a base62 encoded number |
base62_encode(i) | Encode an integer into base62 |
compress(s) | Compress a string with gzip |
extract(str, pattern) | Search a pattern and return the first group of the first match. |
extracts(str, pattern, default) | Search a pattern and return groups of the first match. |
file_unicode(f) | Convert a filename (unicode, utf-8 or iso-8859-1) into unicode |
file_unicode_list(l) | Convert filenames into unicode strings inside a list |
get_col(str, **kwargs) | Extract a column from a string |
if_val(val, str1, str2) | Select one string depending on a value |
is_email_valid(email) | Check that a string is a valid email address |
is_ip_valid(address) | Check that a string is a valid IP address (IPv4) |
no_one_many(n, str0, str1, str2) | Select one string depending on whether n is 0, 1 or many |
plural(str1, str2, n) | Select one string depending on a numeric value |
random_password([length]) | Generate an “easy to memorize” random password |
remove_accents(s) | Remove accents |
slugify(value) | Convert string to a slug |
truncate(str, maxsize[, max_end_str, ...]) | Truncate a string to a specified length and add a suffix (if truncated only) |
uncompress(s) | Uncompress a gzipped string |
version_lt(v1, v2) | Compare two strings representing dotted version numbers |
vjust(str[, level, delim, bitsize, fillchar]) | Justify a string representing a dotted version number |
Functions documentation
- pystrext.MB_GB(str, MB='MB', GB='GB')
Transforms a string representing a memory value in megabytes into gigabytes if the value is > 1024
You can specify your own unit labels.
Parameters: - str (str) – string representing a value to be divided by 1024 if > 1024
- MB (str) – string for the lowest unit
- GB (str) – string for the highest unit
Returns: The converted string
Return type: str
Examples
>>> MB_GB('512')
'512 MB'
>>> MB_GB('2048')
'2 GB'
>>> MB_GB('2048','MBytes','GBytes')
'2 GBytes'
- pystrext.MHz_GHz(str)
Transforms a string representing a frequency from MHz to GHz if value > 1024
Parameters: str (str) – string representing the frequency (MHz)
Returns: The converted string
Return type: str
Examples
>>> MHz_GHz('377')
'377 MHz'
>>> MHz_GHz('2048')
'2.048 GHz'
- pystrext.base62_decode(str)
Decode a base62 encoded number
Parameters: str (str) – base62 string
Returns: decoded integer
Return type: int
Examples
>>> base62_decode('GW')
1024
- pystrext.base62_encode(i)
Encode an integer into base62
This can be useful when encoding an item id to build a short URL: see URL shorteners like http://pack.li
Parameters: i (int) – the number to convert
Returns: base62 encoded string
Return type: str
Examples
>>> base62_encode(1024)
'GW'
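The 1024 / 'GW' pair shown in the examples is consistent with an alphabet made of digits, then uppercase letters, then lowercase letters. The following is a minimal sketch under that assumption. The names b62_encode and b62_decode and the alphabet order are illustrative guesses, not pystrext's actual code.

```python
import string

# Assumed alphabet order (digits, uppercase, lowercase); it reproduces 1024 <-> 'GW'.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def b62_encode(number):
    """Encode a non-negative integer as a base62 string (hypothetical helper)."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        # Peel off the least significant base62 digit on each pass.
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def b62_decode(text):
    """Decode a base62 string back to an integer (hypothetical helper)."""
    value = 0
    for char in text:
        value = value * 62 + ALPHABET.index(char)
    return value
```

With this alphabet, 1024 = 16 * 62 + 32, and the characters at positions 16 and 32 are 'G' and 'W'.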
- pystrext.compress(s)
Compress a string with gzip
This is useful for compressing data in memory without creating a file.
Parameters: s (str) – The string to compress
Returns: The compressed string
Return type: str
Examples
>>> s = "monty python" * 80
>>> len(s)
960
>>> c = compress(s)
>>> len(c)
41
>>> u = uncompress(c)
>>> len(u)
960
>>> s == u
True
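For comparison, the same in-memory round trip can be sketched with the standard library alone. This assumes Python 3, where gzip works on bytes, so the text is encoded first. It is not pystrext's implementation, and the exact compressed size may differ from the figure shown above.

```python
import gzip

text = "monty python" * 80

# gzip.compress() works on bytes, so encode the text before compressing.
packed = gzip.compress(text.encode("utf-8"))

# gzip.decompress() returns bytes; decode to get the original text back.
restored = gzip.decompress(packed).decode("utf-8")

print(len(text), len(packed) < len(text), restored == text)
```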
- pystrext.extract(str, pattern)
Search a pattern and return the first group of the first match.
The pattern must include a group, i.e. it must include parentheses. Only the part inside the parentheses will be returned.
Parameters: - str (str) – The string to search
- pattern (RegexObject or str) – A regular expression object or a string for the pattern to search
Returns: The extracted string that matches the group, or None if there is no match.
Return type: str
Examples
>>> import re
>>> extract('the full monty python', r'(\w+) python')
'monty'
>>> r = re.compile('>([^<]*)<')
>>> extract('this is text form : >the answer<', r)
'the answer'
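The described behavior (first group of the first match, None otherwise) can be approximated with the standard re module. This is a hedged sketch: extract_sketch is a hypothetical name, and the real function may treat edge cases differently.

```python
import re


def extract_sketch(text, pattern):
    """Return group 1 of the first match, or None (hypothetical helper)."""
    # re.search() accepts either a pattern string or a compiled pattern object.
    match = re.search(pattern, text)
    return match.group(1) if match else None


word = extract_sketch("the full monty python", r"(\w+) python")
tagged = extract_sketch("this is text form : >the answer<", re.compile(">([^<]*)<"))
missing = extract_sketch("nothing to see", r"(\d+)")
```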
- pystrext.extracts(str, pattern, default)
Search a pattern and return groups of the first match.
The pattern must include groups, i.e. it must include parentheses. Only the parts inside the parentheses will be returned.
Parameters: - str (str) – The string to search
- pattern (RegexObject or str) – A regular expression object or a string for the pattern to search
- default – The value to return if there is no match
Returns: The extracted string for each group that matches the pattern, or the default argument if there is no match.
Return type: list
Examples
>>> h,m,s = extracts('elapse time : 5h 7m 36s', r'(\d+)h (\d+)m (\d+)s', ['0']*3)
>>> h,m,s
('5', '7', '36')
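The reason for the default argument is that the result can always be unpacked, even when nothing matches. A minimal sketch of that idea with the standard re module follows. The name extracts_sketch is hypothetical and this is not the library's code.

```python
import re


def extracts_sketch(text, pattern, default):
    """Return all groups of the first match as a list, or default (hypothetical helper)."""
    match = re.search(pattern, text)
    return list(match.groups()) if match else default


duration = r"(\d+)h (\d+)m (\d+)s"
h, m, s = extracts_sketch("elapse time : 5h 7m 36s", duration, ["0"] * 3)

# No match: the default keeps the three-way unpacking valid.
h0, m0, s0 = extracts_sketch("no duration here", duration, ["0"] * 3)
```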
- pystrext.file_unicode(f)
Convert a filename (unicode, utf-8 or iso-8859-1) into unicode
Parameters: f (str) – a filename
Returns: converted filename
Return type: unicode
- pystrext.file_unicode_list(l)
Convert filenames into unicode strings inside a list
Parameters: l (list) – a filename list
Returns: converted filename list
Return type: list
- pystrext.get_col(str, **kwargs)
Extract a column from a string
Some strings contain several columns separated by a separator. Some may also contain sub-columns separated by another separator. get_col() can extract one column/sub-column at any depth level. You have to specify one separator and one column number for each depth level you want to select.
Argument names are important: they must be col<n> and sep<n>.
Arguments are sorted so sep1/col1 is searched before sep2/col2.
The separator can be a regular expression (‘\W+’ by default).
Parameters: - str (str) – The listing row to parse
- sep1 (str) – separator 1
- col1 (int) – column number 1
- sepn (str) – separator n
- coln (int) – column number n
Returns: The column/sub-column requested
Return type: str
Examples
>>> get_col(" 4 0 95 0 0 0| 0 72k| 352k 40k| 0 0 | 435 138 ", col1=2, sep1=r'\|', col2=-1, sep2=r'\W+')
'40k'
>>> get_col("/a/b/c/basename.date.jpg", col1=-1, sep1='/', col2=1, sep2=r'\.')
'date'
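The level-by-level selection can be sketched as repeated regular-expression splits. This sketch assumes zero-based column numbers with negative indexes counting from the end, which is inferred from the two examples above rather than confirmed by the library. get_col_sketch is a hypothetical name.

```python
import re


def get_col_sketch(text, **kwargs):
    """Select a column per depth level using col<n>/sep<n> keywords (hypothetical helper)."""
    # Depth levels are taken from the numeric suffix of the col<n> keywords.
    levels = sorted(int(name[3:]) for name in kwargs if name.startswith("col"))
    for n in levels:
        # Assumed default separator: runs of non-word characters.
        separator = kwargs.get("sep%d" % n, r"\W+")
        text = re.split(separator, text)[kwargs["col%d" % n]]
    return text


row = " 4 0 95 0 0 0| 0 72k| 352k 40k| 0 0 | 435 138 "
cell = get_col_sketch(row, col1=2, sep1=r"\|", col2=-1, sep2=r"\W+")
part = get_col_sketch("/a/b/c/basename.date.jpg", col1=-1, sep1="/", col2=1, sep2=r"\.")
```

In the first call, splitting on '|' and taking index 2 gives ' 352k 40k'. Splitting that on non-word runs and taking the last item gives '40k'.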
- pystrext.if_val(val, str1, str2)
Select one string depending on a value
This function returns str1 if val is true, str2 otherwise.
str1 or str2 can be a string that may include ‘%(val)s’: it will be replaced by the value of val.
str1 or str2 can be a callable: it will be called with val as argument.
Parameters: - val (any type) – value to test
- str1 (str or callable) – The string to return if val
- str2 (str or callable) – The string to return if not val
Returns: str1 or str2 or str1(val) or str2(val)
Return type: str
Examples
>>> print(if_val(3,"Item(s) found : %(val)s","No item found"))
Item(s) found : 3
>>> print(if_val(0,"Item(s) found : %(val)s","No item found"))
No item found
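The examples above only cover the template form. The sketch below shows how the string and callable cases could fit together. It is an assumption-laden illustration (if_val_sketch is a hypothetical name), and a template containing a stray '%' would need escaping as '%%'.

```python
def if_val_sketch(val, str1, str2):
    """Pick str1 or str2 from the truthiness of val (hypothetical helper)."""
    chosen = str1 if val else str2
    if callable(chosen):
        # Callables receive the tested value and produce the final text.
        return chosen(val)
    # A mapping on the right-hand side fills '%(val)s' and leaves other text alone.
    return chosen % {"val": val}


found = if_val_sketch(3, "Item(s) found : %(val)s", "No item found")
empty = if_val_sketch(0, "Item(s) found : %(val)s", "No item found")
stars = if_val_sketch(4, lambda v: "*" * v, "unrated")
```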
- pystrext.is_email_valid(email)
Check that a string is a valid email address
It only checks the string syntax. Useful for form checking.
Parameters: email (str) – email address string to check
Returns: True if email string syntax is correct.
Return type: bool
Examples
>>> is_email_valid('hello@world.com')
True
>>> is_email_valid('hello@world')
False
>>> is_email_valid('hello world')
False
- pystrext.is_ip_valid(address)
Check that a string is a valid IP address (IPv4)
It only checks the IP string syntax. Useful for form checking.
Parameters: address (str) – IP address string to check
Returns: True if IP string syntax is correct.
Return type: bool
Examples
>>> is_ip_valid('12.23.34.45')
True
>>> is_ip_valid('12.23.34.345')
False
>>> is_ip_valid('12.23.34')
False
>>> is_ip_valid('12.23.34a.45')
False
>>> is_ip_valid('12.23.34.45.56')
False
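The five cases above correspond to three syntax rules: exactly four dot-separated parts, digits only, and each value at most 255. A minimal sketch of such a check follows, assuming Python 3.7+ for str.isascii(). The name is_ipv4_sketch is hypothetical and the library may apply different rules, for example on leading zeros.

```python
def is_ipv4_sketch(address):
    """Syntax-only IPv4 check: four decimal parts in the range 0-255 (hypothetical helper)."""
    parts = address.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # isdigit() alone accepts some non-ASCII digits, so require ASCII too.
        if not (part.isascii() and part.isdigit()):
            return False
        if int(part) > 255:
            return False
    return True


samples = ["12.23.34.45", "12.23.34.345", "12.23.34", "12.23.34a.45", "12.23.34.45.56"]
results = [is_ipv4_sketch(sample) for sample in samples]
```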
- pystrext.no_one_many(n, str0, str1, str2)
Select one string depending on whether n is 0, 1 or many
This function will return:
- str0 if n <= 0
- str1 if n == 1
- str2 if n >= 2
str0, str1 or str2 can be a string that may include ‘%(n)s’: it will be replaced by the value of n.
str0, str1 or str2 can be a callable: it will be called with n as argument.
Parameters: - n (int) – value to test
- str0 (str or callable) – The string to return if n <= 0
- str1 (str or callable) – The string to return if n == 1
- str2 (str or callable) – The string to return if n >= 2
Returns: str0 or str1 or str2
Return type: str
Examples
>>> print(no_one_many(0,"No item","One item","%(n)s items"))
No item
>>> print(no_one_many(1,"No item","One item","%(n)s items"))
One item
>>> print(no_one_many(36,"No item","One item","%(n)s items"))
36 items
- pystrext.plural(str1, str2, n)
Select one string depending on a numeric value
Parameters: - str1 (str) – The string to return if n <= 1
- str2 (str) – The string to return if n > 1
- n (int) – the numeric value to test
Returns: str1 if n is 1 or less, str2 otherwise.
Return type: str
Examples
>>> n=1
>>> print("found %d %s" % (n,plural("item","items",n)))
found 1 item
>>> n=4
>>> print("found %d %s" % (n,plural("item","items",n)))
found 4 items
- pystrext.random_password(length=8)
Generate an “easy to memorize” random password
Parameters: length (int) – Password length (default: 8)
Returns: The random password
Return type: str
Examples
>>> random_password()
'rixerutu'
- pystrext.remove_accents(s)
Remove accents
Parameters: s (str) – string to convert
Returns: same string without any accent
Return type: str
Examples
>>> remove_accents('Et voilà !')
'Et voila !'
- pystrext.slugify(value)
Convert string to a slug
Parameters: value (str) – the string to convert
Returns: the slug
Return type: str
Examples
>>> slugify("he'l'lO Wörld !")
'hello-world'
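Accent removal and slug generation are commonly built on Unicode decomposition. The sketch below uses the standard unicodedata and re modules and reproduces the two documented outputs. The function names are hypothetical, and the approach is a widespread technique rather than a confirmed description of pystrext's internals.

```python
import re
import unicodedata


def strip_accents_sketch(text):
    """Drop combining marks after NFKD decomposition (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def slugify_sketch(value):
    """Lowercase, strip punctuation, and join words with hyphens (hypothetical helper)."""
    value = strip_accents_sketch(value)
    # Keep only word characters, whitespace and hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace or hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


plain = strip_accents_sketch("Et voil\u00e0 !")
slug = slugify_sketch("he'l'lO W\u00f6rld !")
```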
- pystrext.truncate(str, maxsize, max_end_str='...', end_str_inside=True)
Truncate a string to a specified length and add a suffix (if truncated only)
Parameters: - str (str) – The string to truncate
- maxsize (int) – string maximum size before truncating and adding a suffix
- max_end_str (str) – suffix to add when string has been truncated (‘...’ by default)
- end_str_inside (bool) – If True, the suffix counts towards maxsize, so the final string is no longer than maxsize. If False, the suffix is appended after the truncated string, so the final string is no longer than maxsize plus the suffix length.
Returns: The truncated string
Return type: str
Examples
>>> truncate('hello world !',80)
'hello world !'
>>> truncate('hello world !',5)
'he...'
>>> truncate('hello world !',5,end_str_inside=False)
'hello...'
>>> truncate('hello world !',5,' <a href="?more">more...</a>',False)
'hello <a href="?more">more...</a>'
>>> truncate('hello world !',80,' <a href="?more">more...</a>',False)
'hello world !'
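The effect of end_str_inside is easiest to see as two slice lengths. The following sketch mirrors the documented examples. truncate_sketch is a hypothetical name, and its handling of a maxsize shorter than the suffix is an assumption.

```python
def truncate_sketch(text, maxsize, max_end_str="...", end_str_inside=True):
    """Cut text to maxsize and append a suffix only when cut (hypothetical helper)."""
    if len(text) <= maxsize:
        return text
    if end_str_inside:
        # Reserve room for the suffix inside the maxsize budget.
        keep = max(0, maxsize - len(max_end_str))
    else:
        # Keep maxsize characters and append the suffix after them.
        keep = maxsize
    return text[:keep] + max_end_str


inside = truncate_sketch("hello world !", 5)
outside = truncate_sketch("hello world !", 5, end_str_inside=False)
untouched = truncate_sketch("hello world !", 80)
```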
- pystrext.uncompress(s)
Uncompress a gzipped string
You can uncompress strings compressed with pystrext.compress() as well as the contents of a .gz file read with open() and read().
Parameters: s (str) – The gzipped string to uncompress
Returns: The uncompressed string
Return type: str
Examples
See pystrext.compress()
- pystrext.version_lt(v1, v2)
Compare two strings representing dotted version numbers
The goal is to test whether a version string is older than another one. A version string looks like this: “1.45.2.1”. It can have as many dots as you want, but the strings between dots cannot be more than 10 characters long. String comparisons are done “dot by dot” from left to right and stop as soon as two substrings differ.
Parameters: - v1 (str) – version1 string
- v2 (str) – version2 string
Returns: True if v1 is older than v2
Return type: bool
Examples
>>> version_lt('1.2.0','1.12.0')
True
>>> version_lt('1.2.0','1.1.9')
False
>>> version_lt('1.43c','1.43f')
True
>>> version_lt('1.2.5.6.7.8','1.2.5.6.7.9')
True
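One way to get a "dot by dot" comparison that ranks '2' before '12' while still accepting letters is to right-justify each substring to a fixed width before comparing. The sketch below assumes that approach, with the width taken from the 10-character limit mentioned above. It reproduces the documented results but is not necessarily how pystrext does it.

```python
def version_lt_sketch(v1, v2, width=10):
    """Return True if v1 sorts before v2, comparing padded substrings (hypothetical helper)."""
    # Right-justifying makes '2' compare lower than '12' as plain strings.
    left = [part.rjust(width) for part in v1.split(".")]
    right = [part.rjust(width) for part in v2.split(".")]
    # List comparison goes left to right and stops at the first difference.
    return left < right


checks = [
    version_lt_sketch("1.2.0", "1.12.0"),
    version_lt_sketch("1.2.0", "1.1.9"),
    version_lt_sketch("1.43c", "1.43f"),
    version_lt_sketch("1.2.5.6.7.8", "1.2.5.6.7.9"),
]
```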
- pystrext.vjust(str, level=5, delim='.', bitsize=6, fillchar=' ')
Justify a string representing a dotted version number
The goal is to justify/format a version string so that it can be filtered by an SQL engine: each substring gets a fixed length so that SQL string comparison can be used.
Parameters: - str (str) – dotted version number
- level (int) – maximum number of version substrings
- delim (str) – separator (a dot by default)
- bitsize (int) – maximum substring length
- fillchar (str) – the character used to fill the blanks
Returns: the justified string
Return type: str
Examples
>>> vjust('1.2')
'     1.     2.      .      .      .      '
>>> vjust('1.12')
'     1.    12.      .      .      .      '
>>> vjust('1.12',fillchar='0')
'000001.000012.000000.000000.000000.000000'
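The fixed-width layout can be sketched by padding each substring and filling in the missing ones. The documented fillchar='0' output contains six fields for the default level=5, so this sketch emits level + 1 fields. That count is an inference from the example, and vjust_sketch is a hypothetical name.

```python
def vjust_sketch(version, level=5, delim=".", bitsize=6, fillchar=" "):
    """Right-justify each version substring to a fixed width (hypothetical helper)."""
    fields = level + 1  # assumption: matches the six fields shown for level=5
    parts = version.split(delim)
    # Pad with empty substrings, then cut to the assumed field count.
    parts = (parts + [""] * fields)[:fields]
    return delim.join(part.rjust(bitsize, fillchar) for part in parts)


zero_filled = vjust_sketch("1.12", fillchar="0")

# Justified strings order correctly under plain string comparison.
ordered = vjust_sketch("1.2") < vjust_sketch("1.12")
```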