import re

# Sample footer lines, separated by \r\n, from which the owner name is extracted.
s = "Copyright © 2019 Apple Inc. All rights reserved.\r\n© 2019 Quid, Inc. All Rights Reserved.\r\n© 2009 Database Designs \r\n© 2019 Rediker Software, All Rights Reserved\r\n©2019 EVOSUS, INC. ALL RIGHTS RESERVED\r\n© 2019 Walmart. All Rights Reserved.\r\n© Copyright 2003-2019 Exxon Mobil Corporation. All Rights Reserved.\r\nCopyright © 1978-2019 Berkshire Hathaway Inc.\r\n© 2019 McKesson Corporation\r\n© 2019 UnitedHealth Group. All rights reserved.\r\n© Copyright 1999 - 2019 CVS Health\r\nCopyright 2019 General Motors. All Rights Reserved.\r\n© 2019 Ford Motor Company\r\n©2019 AT&T Intellectual Property. All rights reserved.\r\n© 2019 GENERAL ELECTRIC\r\nCopyright ©2019 AmerisourceBergen Corporation. All Rights Reserved.\r\n© 2019 Verizon\r\n© 2019 Fannie Mae\r\nCopyright © 2018 Jonas Construction Software Inc. All rights reserved.\r\nAll Comments © Copyright 2017 Kroger | The Kroger Co. All Rights Reserved\r\n© 2019 Express Scripts Holding Company. All Rights Reserved. 1 Express Way, St. Louis, MO 63121\r\n© 2019 JPMorgan Chase & Co.\r\nCopyright © 1995 - 2018 Boeing. All Rights Reserved.\r\n© 2019 Bank of America Corporation. All rights reserved.\r\n© 1999 - 2019 Wells Fargo. All rights reserved. NMLSR ID 399801\r\n©2019 Cardinal Health. All rights reserved.\r\n© 2019 Quid, Inc All Rights Reserved.\r\n602-226-2389 ©2019 Endurance International Group.\r\nCopyright 1999 — 2019 © Iflexion. All rights reserved.\r\nISO 9001:2008, ISO/ IEC 27001:2005 © Mobikasa 2019\r\n© 2019 Copyright arcadia.io.\r\n2018 © Power Tools LLC\r\nCopyright 2019 ComputerEase Construction Software | 1-800-544-2530\r\n© 2019 3M. 3M Health Information Systems Privacy Policy"
rx = r'''(?xi)                        # x: verbose mode, i: case-insensitive
(?:©                                  # Start of a group: © symbol
    (?:\s*                            # Start of optional group: 0+ whitespaces
        (?:\d{4}                      # Start of optional group: 4 digits
            (?:\s*[-—–]\s*\d{4})?     # Optional range: 0+ spaces, a dash, 0+ spaces, 4 digits
        )?                            # End of group
        \s*Copyright                  # 0+ spaces and the word Copyright
    )?                                # End of group
  |                                   # OR
    Copyright
    (?:\s*                            # Start of optional group: 0+ whitespaces
        (?:\d{4}                      # Start of optional group: 4 digits
            (?:\s*[-—–]\s*\d{4})?     # Optional range: 0+ spaces, a dash, 0+ spaces, 4 digits
        )?\s*©                        # End of group, 0+ spaces, ©
    )?                                # End of group
)                                     # End of group
(?:\s*\d{4}(?:\s*[-—–]\s*\d{4})?)?    # Optional year, optionally followed by a dash surrounded by whitespace and a second year
\s*                                   # 0+ whitespaces
(                                     # Start of a capturing group (the owner name):
    .*?                               # any 0+ chars other than linebreak chars, as few as possible, up to...
    (?=\s*[.|]|                       # 0+ spaces and then | or ., or
       \W*All\s+rights\s+reserved)    # All rights reserved with any 0+ non-word chars before it
  |                                   # or
    .*\b                              # any 0+ chars other than linebreak chars, as many as possible, ending at a word boundary
)'''

# With a single capturing group, re.findall returns a list of the captured names.
for m in re.findall(rx, s):
    print(m)
stdout
Apple Inc
Quid, Inc
Database Designs
Rediker Software
EVOSUS, INC
Walmart
Exxon Mobil Corporation
Berkshire Hathaway Inc
McKesson Corporation
UnitedHealth Group
CVS Health
General Motors
Ford Motor Company
AT&T Intellectual Property
GENERAL ELECTRIC
AmerisourceBergen Corporation
Verizon
Fannie Mae
Jonas Construction Software Inc
Kroger
Express Scripts Holding Company
JPMorgan Chase & Co
Boeing
Bank of America Corporation
Wells Fargo
Cardinal Health
Quid, Inc
Endurance International Group
Iflexion
Mobikasa 2019
arcadia
Power Tools LLC
ComputerEase Construction Software
3M
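
The output above still contains "Mobikasa 2019", because in that input line the year follows the name instead of preceding it. One possible post-processing step, given here only as a sketch, is to strip a trailing year or year range from each extracted name. The names TRAILING_YEAR and strip_trailing_year are hypothetical and not part of the original snippet.

```python
import re

# Hypothetical helper, not part of the original snippet: matches a year or a
# year range (e.g. "2019" or "1999 - 2019") at the very end of a string.
TRAILING_YEAR = re.compile(r'\s*\d{4}(?:\s*[-—–]\s*\d{4})?\s*$')

def strip_trailing_year(name):
    """Return name without a trailing 4-digit year or year range."""
    return TRAILING_YEAR.sub('', name)

# Names taken from the stdout listing above.
for name in ["Mobikasa 2019", "Apple Inc", "3M"]:
    print(strip_trailing_year(name))
```

This assumes that a 4-digit number at the end of a name is always a year. A company whose name really ends in four digits would be truncated, so the step may need an allowlist in practice.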