import re

def preprocess(text):
    # Replace every non-letter character with a space,
    # except a hyphen that sits directly between two letters.
    text = re.sub(r'[\W\d_](?<![^\W\d_]-(?=[^\W\d_]))', ' ', text)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())

print(preprocess("Attended pre-tender, etc meetings."))
stdout
Attended pre-tender etc meetings
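
The run above covers only one input. As a rough sketch, the same function can be checked against a few edge cases: leading, trailing, and doubled hyphens, plus digits and underscores. The sample strings and the `cases` name are illustrative assumptions and are not part of the original paste.

```python
import re

def preprocess(text):
    # Same pattern as in the snippet above: blank out non-letters,
    # but keep a hyphen that has a letter on both sides.
    text = re.sub(r'[\W\d_](?<![^\W\d_]-(?=[^\W\d_]))', ' ', text)
    return " ".join(text.split())

# Hypothetical inputs chosen to probe the hyphen rule.
cases = [
    "-start end-",               # hyphens at the string edges are dropped
    "a--b",                      # neither hyphen has letters on both sides
    "well-known 24-hour shift",  # digits go, and so does the hyphen after them
    "snake_case",                # underscore counts as a non-letter
]

for case in cases:
    print(repr(case), "->", repr(preprocess(case)))
```

Only a hyphen flanked by letters on both sides survives, so "24-hour" loses its hyphen along with the digits.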