Python built-in string methods (a bookmark-worthy reference)

String processing is a very common task, but Python has so many built-in string methods that they are easy to forget. For quick reference, this article gives an example of each built-in method, written against Python 3.5.1 and grouped by category, so that everyone can look them up.

Overview

String case conversion

  • str.capitalize()

  • str.lower()

  • str.casefold()

  • str.swapcase()

  • str.title()

  • str.upper()

String format output

  • str.center(width[, fillchar])

  • str.ljust(width[, fillchar]); str.rjust(width[, fillchar])

  • str.zfill(width)

  • str.expandtabs(tabsize=8)

  • str.format(*args, **kwargs)

  • str.format_map(mapping)

String search location and substitution

  • str.count(sub[, start[, end]])

  • str.find(sub[, start[, end]]); str.rfind(sub[, start[, end]])

  • str.index(sub[, start[, end]]); str.rindex(sub[, start[, end]])

  • str.replace(old, new[, count])

  • str.lstrip([chars]); str.rstrip([chars]); str.strip([chars])

  • static str.maketrans(x[, y[, z]]); str.translate(table)

Union and segmentation of strings

  • str.join(iterable)

  • str.partition(sep); str.rpartition(sep)

  • str.split(sep=None, maxsplit=-1); str.rsplit(sep=None, maxsplit=-1)

  • str.splitlines([keepends])

String conditional judgement

  • str.endswith(suffix[, start[, end]]); str.startswith(prefix[, start[, end]])

  • str.isalnum()

  • str.isalpha()

  • str.isdecimal(); str.isdigit(); str.isnumeric()

  • str.isidentifier()

  • str.islower()

  • str.isprintable()

  • str.isspace()

  • str.istitle()

  • str.isupper()

String encoding

  • str.encode(encoding="utf-8", errors="strict")

String case conversion

str.capitalize()

Converts the first character to uppercase and the rest to lowercase. Note that if the first character has no uppercase form, it is left unchanged.

'adi dog'.capitalize()
# 'Adi dog'

'abcd 徐'.capitalize()
# 'Abcd 徐'

'徐 abcd'.capitalize()
# '徐 abcd'

'ß'.capitalize()
# 'SS'  (Python 3.5; from Python 3.8 the result is 'Ss')

str.lower()

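Converts all cased characters in the string to lowercase. Unlike casefold(), it performs no further folding, so a character that is already lowercase, such as 'ß', is left unchanged.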

'DOBI'.lower()
# 'dobi'

'ß'.lower()   # 'ß' is a German lowercase letter with another lowercase form, 'ss'; lower() does not convert it
# 'ß'

'徐 ABCD'.lower()
# '徐 abcd'

str.casefold()

Converts the string to lowercase more aggressively than lower(): every character that has a corresponding folded form in Unicode is converted, which makes the result suitable for caseless comparison.

'DOBI'.casefold()
# 'dobi'

'ß'.casefold()   # the German lowercase letter ß is equivalent to 'ss'
# 'ss'
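The difference between lower() and casefold() matters most in case-insensitive comparison. The following is a minimal sketch; the helper name same_text is hypothetical and not part of the standard library.

```python
def same_text(a, b):
    """Compare two strings while ignoring case (hypothetical helper)."""
    return a.casefold() == b.casefold()

# lower() leaves 'ß' alone, so the two spellings do not match
lower_match = 'STRASSE'.lower() == 'straße'.lower()

# casefold() folds 'ß' to 'ss', so they do match
fold_match = same_text('STRASSE', 'straße')

print(lower_match, fold_match)
# False True
```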

str.swapcase()

Inverts the capitalization of string letters.

'徐 Dobi a123 ß'.swapcase()
# '徐 dOBI A123 SS'    here ß is converted to SS, which is its uppercase form

Note that s.swapcase().swapcase() == s is not necessarily true:

u'\xb5'
# 'µ'

u'\xb5'.swapcase()
# 'Μ'

u'\xb5'.swapcase().swapcase()
# 'μ'

hex(ord(u'\xb5'.swapcase().swapcase()))
# '0x3bc'

Here 'Μ' is the Greek capital letter Mu, not the Latin M. Its lowercase form 'μ' (U+03BC) is written the same way as the original 'µ' (U+00B5) but is a different character.

str.title()

Capitalizes the first letter of each word in the string. Words are delimited by whitespace and punctuation, so the result is wrong for apostrophes in English possessives and contractions, and for all-caps abbreviations.

'Hello world'.title()
# 'Hello World'

'Chinese ABC def 12gh'.title()
# 'Chinese Abc Def 12Gh'

# But this method is not perfect:
"they're bill's friends from the UK".title()
# "They'Re Bill'S Friends From The Uk"
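The apostrophe problem can be worked around with a regular expression. The sketch below follows the lines of the workaround described in the official documentation for str.title(); the function name titlecase and the ASCII-only pattern are assumptions for illustration. Abbreviations such as "UK" are still lowercased after the first letter.

```python
import re

def titlecase(s):
    """Capitalize each word, treating an apostrophe suffix as part of the word."""
    return re.sub(
        r"[A-Za-z]+('[A-Za-z]+)?",
        lambda mo: mo.group(0).capitalize(),
        s,
    )

fixed = titlecase("they're bill's friends from the UK")
print(fixed)
# They're Bill's Friends From The Uk
```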

str.upper()

Converts all letters in the string to uppercase; characters that have no uppercase form are left unchanged.

'中文abc def 12gh'.upper()
# '中文ABC DEF 12GH'

Note that s.upper().isupper() is not necessarily True, for example when s contains no cased characters, as in '123'.

String format output

str.center(width[, fillchar])

Centers the string within the given width, padding with the given fill character (an ASCII space by default). If the width is less than or equal to the string length, the original string is returned.

'12345'.center(10, '*')
# '**12345***'

'12345'.center(10)
# '  12345   '

str.ljust(width[, fillchar]); str.rjust(width[, fillchar])

Returns the string left-justified (ljust) or right-justified (rjust) within the given width. If the width is less than or equal to the string length, the original string is returned. Padding uses an ASCII space by default; a different fill character can be specified.

'dobi'.ljust(10)
# 'dobi      '

'dobi'.ljust(10, '~')
# 'dobi~~~~~~'

'dobi'.ljust(3, '~')
# 'dobi'

'dobi'.ljust(3)
# 'dobi'

str.zfill(width)

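Pads the string on the left with '0' up to the given width. A leading sign ('+' or '-') stays in front of the padding, and if the width is less than or equal to the string length, the original string is returned.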

"42".zfill(5)
# '00042'
"-42".zfill(5)
# '-0042'

'dd'.zfill(5)
# '000dd'

'--'.zfill(5)
# '-000-'

' '.zfill(5)
# '0000 '

''.zfill(5)
# '00000'

'dddddddd'.zfill(5)
# 'dddddddd'

str.expandtabs(tabsize=8)

Replaces each horizontal tab with spaces so that the text after it starts at the next column that is a multiple of tabsize.

tab = '1\t23\t456\t7890\t1112131415\t161718192021'

tab.expandtabs()
# '1       23      456     7890    1112131415      161718192021'
# '123456781234567812345678123456781234567812345678'  Note how the number of spaces relates to the column ruler on this line.

tab.expandtabs(4)
# '1   23  456 7890    1112131415  161718192021'
# '12341234123412341234123412341234'  

str.format(*args, **kwargs)

The format string syntax is quite rich, and the official documentation already has detailed examples. Readers who want the full picture can go straight to the "Format examples" section there.
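For orientation only, here are a few of the most common cases, as a small sketch and not a substitute for the official examples. The variable names are chosen for illustration.

```python
# Positional and keyword replacement fields
greeting = '{0} is {age} years old'.format('john', age=56)

# Fill, alignment and width, comparable to center()/ljust()/rjust()
centered = '{:*^10}'.format('12345')

# Zero padding for integers and fixed precision for floats
padded = '{:05d}'.format(42)
rounded = '{:.2f}'.format(3.14159)

print(greeting, centered, padded, rounded, sep='\n')
# john is 56 years old
# **12345***
# 00042
# 3.14
```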

str.format_map(mapping)

Similar to str.format(*args, **kwargs); the difference is that mapping is a dictionary-like object that is used directly.

People = {'name':'john', 'age':56}

'My name is {name},i am {age} old'.format_map(People)
# 'My name is john,i am 56 old'

String search location and substitution

str.count(sub[, start[, end]])

text = 'outer protective covering'

text.count('e')
# 4

text.count('e', 5, 11)
# 1

text.count('e', 5, 10)
# 0

str.find(sub[, start[, end]]); str.rfind(sub[, start[, end]])

text = 'outer protective covering'

text.find('er')
# 3

text.find('to')
# -1

text.find('er', 3)
# 3

text.find('er', 4)
# 20

text.find('er', 4, 21)
# -1

text.find('er', 4, 22)
# 20

text.rfind('er')
# 20

text.rfind('er', 20)
# 20

text.rfind('er', 20, 21)
# -1

str.index(sub[, start[, end]]); str.rindex(sub[, start[, end]])

Similar to find() and rfind(); the difference is that ValueError is raised when the substring is not found.
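A minimal sketch of the difference: index() raises where find() returns -1, so the caller has to catch the exception. The helper name position_or_none is hypothetical.

```python
text = 'outer protective covering'

def position_or_none(s, sub):
    """Return the index of sub in s, or None when it is absent (hypothetical helper)."""
    try:
        return s.index(sub)
    except ValueError:
        return None

found = position_or_none(text, 'er')
missing = position_or_none(text, 'to')
print(found, missing)
# 3 None
```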

str.replace(old, new[, count])

'dog wow wow jiao'.replace('wow', 'wang')
# 'dog wang wang jiao'

'dog wow wow jiao'.replace('wow', 'wang', 1)
# 'dog wang wow jiao'

'dog wow wow jiao'.replace('wow', 'wang', 0)
# 'dog wow wow jiao'

'dog wow wow jiao'.replace('wow', 'wang', 2)
# 'dog wang wang jiao'

'dog wow wow jiao'.replace('wow', 'wang', 3)
# 'dog wang wang jiao'

str.lstrip([chars]); str.rstrip([chars]); str.strip([chars])

'  dobi'.lstrip()
# 'dobi'
'db.kun.ac.cn'.lstrip('dbk')
# '.kun.ac.cn'

' dobi   '.rstrip()
# ' dobi'
'db.kun.ac.cn'.rstrip('acn')
# 'db.kun.ac.'

'   dobi   '.strip()
# 'dobi'
'db.kun.ac.cn'.strip('db.c')
# 'kun.ac.cn'
'db.kun.ac.cn'.strip('cbd.un')
# 'kun.a'
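As the examples show, chars is treated as a set of characters to remove, not as a prefix or suffix. When an exact prefix has to go, a small helper built on startswith() is one option; strip_prefix below is a hypothetical name used for illustration.

```python
def strip_prefix(s, prefix):
    """Remove prefix once if s starts with it (hypothetical helper)."""
    if prefix and s.startswith(prefix):
        return s[len(prefix):]
    return s

url = 'www.wow.example'

# lstrip removes every leading 'w' and '.', eating into the next label
by_chars = url.lstrip('w.')

# the helper removes only the exact prefix
by_prefix = strip_prefix(url, 'www.')

print(by_chars, by_prefix)
# ow.example wow.example
```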

static str.maketrans(x[, y[, z]]); str.translate(table)

maketrans is a static method that builds a translation table for translate to use.
If maketrans is given only one argument, it must be a dictionary. Each key is either a Unicode code point (an integer) or a string of length 1. Each value can be a string of any length, None, or a Unicode code point.

a = 'dobi'
ord('o')
# 111

ord('a')
# 97

hex(ord('狗'))
# '0x72d7'

b = {'d':'dobi', 111:' is ', 'b':97, 'i':'\u72d7\u72d7'}
table = str.maketrans(b)

a.translate(table)
# 'dobi is a狗狗'

If maketrans is given two arguments, they must be strings of equal length, and each character of the first is mapped to the character at the same position in the second. If there is a third argument, it must also be a string, and each of its characters is mapped to None, that is, deleted.

a = 'dobi is a dog'

table = str.maketrans('dobi', 'alph')

a.translate(table)
# 'alph hs a alg'

table = str.maketrans('dobi', 'alph', 'o')

a.translate(table)
# 'aph hs a ag'
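One common use of the three-argument form is deleting a whole class of characters. The sketch below removes ASCII punctuation using string.punctuation from the standard library; the variable names are illustrative.

```python
import string

# Empty first two arguments: nothing is remapped, characters are only deleted
drop_punct = str.maketrans('', '', string.punctuation)

cleaned = 'dobi, is: a dog!'.translate(drop_punct)
print(cleaned)
# dobi is a dog
```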

Union and segmentation of strings

str.join(iterable)

Joins the elements of an iterable, all of which must be strings, using the given string as the separator.

'-'.join(['2012', '3', '12'])
# '2012-3-12'

'-'.join([2012, 3, 12])
# TypeError: sequence item 0: expected str instance, int found

'-'.join(['2012', '3', b'12'])  # bytes is not str
# TypeError: sequence item 2: expected str instance, bytes found

'-'.join(['2012'])
# '2012'

'-'.join([])
# ''

'-'.join([None])
# TypeError: sequence item 0: expected str instance, NoneType found

'-'.join([''])
# ''

','.join({'dobi':'dog', 'polly':'bird'})
# 'dobi,polly'

','.join({'dobi':'dog', 'polly':'bird'}.values())
# 'dog,bird'
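Since join() accepts only strings, non-string items have to be converted first. A minimal sketch using a generator expression:

```python
parts = [2012, 3, 12]

# Convert each item to str before joining
date = '-'.join(str(p) for p in parts)

# Combine with zfill() to pad single-digit fields
padded_date = '-'.join(str(p).zfill(2) for p in parts)

print(date, padded_date)
# 2012-3-12 2012-03-12
```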

str.partition(sep); str.rpartition(sep)

'dog wow wow jiao'.partition('wow')
# ('dog ', 'wow', ' wow jiao')

'dog wow wow jiao'.partition('dog')
# ('', 'dog', ' wow wow jiao')

'dog wow wow jiao'.partition('jiao')
# ('dog wow wow ', 'jiao', '')

'dog wow wow jiao'.partition('ww')
# ('dog wow wow jiao', '', '')



'dog wow wow jiao'.rpartition('wow')
# ('dog wow ', 'wow', ' jiao')

'dog wow wow jiao'.rpartition('dog')
# ('', 'dog', ' wow wow jiao')

'dog wow wow jiao'.rpartition('jiao')
# ('dog wow wow ', 'jiao', '')

'dog wow wow jiao'.rpartition('ww')
# ('', '', 'dog wow wow jiao')
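Because partition() always returns a 3-tuple and splits only at the first occurrence of sep, it suits "key=value" style parsing where the value may itself contain the separator. The sketch below is illustrative; parse_setting is a hypothetical helper.

```python
def parse_setting(line):
    """Split 'key=value' at the first '='; value is None when '=' is absent."""
    key, sep, value = line.partition('=')
    if not sep:
        return key.strip(), None
    return key.strip(), value.strip()

with_value = parse_setting('path = /usr/bin=x')
without_value = parse_setting('verbose')

print(with_value, without_value)
# ('path', '/usr/bin=x') ('verbose', None)
```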

str.split(sep=None, maxsplit=-1); str.rsplit(sep=None, maxsplit=-1)

'1,2,3'.split(','), '1, 2, 3'.rsplit()
# (['1', '2', '3'], ['1,', '2,', '3'])

'1,2,3'.split(',', maxsplit=1),  '1,2,3'.rsplit(',', maxsplit=1)
# (['1', '2,3'], ['1,2', '3'])

'1 2 3'.split(), '1 2 3'.rsplit()
# (['1', '2', '3'], ['1', '2', '3'])

'1 2 3'.split(maxsplit=1), '1 2 3'.rsplit(maxsplit=1)
# (['1', '2 3'], ['1 2', '3'])

'   1   2   3   '.split()
# ['1', '2', '3']

'1,2,,3,'.split(','), '1,2,,3,'.rsplit(',')
# (['1', '2', '', '3', ''], ['1', '2', '', '3', ''])

''.split()
# []
''.split('a')
# ['']
'bcd'.split('a')
# ['bcd']
'bcd'.split(None)
# ['bcd']

str.splitlines([keepends])

Splits the string into a list at line boundaries. If keepends is True, the line-boundary characters are kept in the result. The full list of recognized line boundaries is given in the official documentation.

'ab c\n\nde fg\rkl\r\n'.splitlines()
# ['ab c', '', 'de fg', 'kl']
'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
# ['ab c\n', '\n', 'de fg\r', 'kl\r\n']

"".splitlines(), ''.split('\n')      # note the difference between them
# ([], [''])

"One line\n".splitlines(), 'Two lines\n'.split('\n')
# (['One line'], ['Two lines', ''])

String conditional judgement

str.endswith(suffix[, start[, end]]); str.startswith(prefix[, start[, end]])

text = 'outer protective covering'

text.endswith('ing')
# True

text.endswith(('gin', 'ing'))
# True
text.endswith('ter', 2, 5)
# True

text.endswith('ter', 2, 4)
# False

str.isalnum()

True if the string is non-empty and every character is a letter or a digit. In short:

As long as any one of c.isalpha(), c.isdecimal(), c.isdigit(), c.isnumeric() is true for a character c, c.isalnum() is true.

'dobi'.isalnum()
# True

'dobi123'.isalnum()
# True

'123'.isalnum()
# True

'徐'.isalnum()
# True

'dobi_123'.isalnum()
# False

'dobi 123'.isalnum()
# False

'%'.isalnum()
# False

str.isalpha()

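True if the string is non-empty and every character is alphabetic. Alphabetic characters are those defined as "Letter" in the Unicode character database, that is, those whose general category is one of "Lm", "Lt", "Lu", "Ll", or "Lo". Note that this differs from the "Alphabetic" property defined in the Unicode Standard.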

'dobi'.isalpha()
# True

'do bi'.isalpha()
# False

'dobi123'.isalpha()
# False

'徐'.isalpha()
# True

str.isdecimal(); str.isdigit(); str.isnumeric()

The three methods differ in which Unicode general categories they accept (roughly; the exact rule is based on the Numeric_Type property, which is why isnumeric() is also true for a character such as '十'):

isdecimal: Nd
isdigit: No, Nd
isnumeric: No, Nd, Nl

The difference between digit and decimal is that some numeric characters, such as the superscript '²', are digit but not decimal.

num = '\u2155'
print(num)
# ⅕
num.isdecimal(), num.isdigit(), num.isnumeric()
# (False, False, True)

num = '\u00B2'
print(num)
# ²
num.isdecimal(), num.isdigit(), num.isnumeric()
# (False, True, True)

num = "1"  # a str (Unicode)
num.isdecimal(), num.isdigit(), num.isnumeric()
# (True, True, True)

num = 'Ⅶ'
num.isdecimal(), num.isdigit(), num.isnumeric()
# (False, False, True)

num = '十'
num.isdecimal(), num.isdigit(), num.isnumeric()
# (False, False, True)

num = b"1" # byte
num.isdigit()   # True
num.isdecimal() # AttributeError 'bytes' object has no attribute 'isdecimal'
num.isnumeric() # AttributeError 'bytes' object has no attribute 'isnumeric'
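To see how the three methods relate to the Unicode general category, the characters can be inspected with unicodedata.category() from the standard library. This is only a sketch; the sample characters and the report dictionary are chosen for illustration.

```python
import unicodedata

# '1', superscript two, one fifth, Roman numeral seven, CJK ten
samples = ['1', '\u00b2', '\u2155', '\u2166', '\u5341']

report = {}
for ch in samples:
    report[ch] = (
        unicodedata.category(ch),
        ch.isdecimal(),
        ch.isdigit(),
        ch.isnumeric(),
    )
    print(ch, *report[ch])
# 1 Nd True True True
# ² No False True True
# ⅕ No False False True
# Ⅶ Nl False False True
# 十 Lo False False True
```

The last row shows why the category list is only an approximation: '十' is a letter (Lo) by category, yet it carries a numeric value, so isnumeric() is true.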

str.isidentifier()

Determines whether a string can be a valid identifier.

'def'.isidentifier()
# True

'with'.isidentifier()
# True

'false'.isidentifier()
# True

'dobi_123'.isidentifier()
# True

'dobi 123'.isidentifier()
# False

'123'.isidentifier()
# False
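As the first examples show, reserved words such as 'def' and 'with' pass isidentifier() even though they cannot be used as names. Combining it with keyword.iskeyword() from the standard library covers that case; is_usable_name is a hypothetical helper.

```python
import keyword

def is_usable_name(name):
    """True if name is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

results = {name: is_usable_name(name) for name in ['def', 'false', 'dobi_123', '123']}
print(results)
# {'def': False, 'false': True, 'dobi_123': True, '123': False}
```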

str.islower()

'徐'.islower()
# False

'\u1e9e'.islower()   # 'ẞ', the German capital sharp s
# False

'a徐'.islower()
# True

'ss'.islower()
# True

'23'.islower()
# False

'Ab'.islower()
# False

str.isprintable()

True if all characters in the string are printable or the string is empty. Characters in the Unicode "Other" and "Separator" categories are non-printable, with the exception of the ASCII space (0x20).

'dobi123'.isprintable()
# True

'dobi123\n'.isprintable()
# False

'dobi 123'.isprintable()
# True

'dobi.123'.isprintable()
# True

''.isprintable()
# True

str.isspace()

True if the string contains at least one character and all characters are whitespace.

'\r\n\t'.isspace()
# True

''.isspace()
# False

' '.isspace()
# True

str.istitle()

Determines whether the string is title-cased, meaning that every word starts with an uppercase letter followed only by lowercase letters. Non-alphabetic characters are ignored.

'How Python Works'.istitle()
# True

'How Python WORKS'.istitle()
# False

'how python works'.istitle()
# False

'How Python  Works'.istitle()
# True

' '.istitle()
# False

''.istitle()
# False

'A'.istitle()
# True

'a'.istitle()
# False

'Toss Abc Def 123'.istitle()
# True

str.isupper()

'徐'.isupper()
# False

'DOBI'.isupper()
# True

'Dobi'.isupper()
# False

'DOBI123'.isupper()
# True

'DOBI 123'.isupper()
# True

'DOBI\t 123'.isupper()
# True

'DOBI_123'.isupper()
# True

'_123'.isupper()
# False

String encoding

str.encode(encoding="utf-8", errors="strict")

 
fname = '徐'

fname.encode('ascii')
# UnicodeEncodeError: 'ascii' codec can't encode character '\u5f90'...

fname.encode('ascii', 'replace')
# b'?'

fname.encode('ascii', 'ignore')
# b''

fname.encode('ascii', 'xmlcharrefreplace')
# b'&#24464;'

fname.encode('ascii', 'backslashreplace')
# b'\\u5f90'
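The error handlers are only needed when the target encoding cannot represent a character. With UTF-8 the same character encodes without loss and can be decoded back, as this short sketch shows.

```python
fname = '\u5f90'  # '徐'

# UTF-8 can represent the character, so no error handler is needed
data = fname.encode('utf-8')

# decode() reverses encode() when the same encoding is used
restored = data.decode('utf-8')

print(data, restored == fname)
# b'\xe5\xbe\x90' True
```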

Reference material

Python documentation: Built-in Types, String Methods
