
Module restxsl.uniquote

Apply SmartyPants-style filtering to a string, transforming the magic character sequences (--, ---, ..., " and ') into their Unicode equivalents. Unlike SmartyPants, which transforms these sequences into HTML entity references, this module returns the Unicode characters themselves.
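
To make the contrast concrete, the short check below pairs numeric entity references of the kind SmartyPants typically emits with the Unicode characters this module returns. It is an illustration only: the `ENTITY_TO_CHAR` table is hypothetical, is not part of restxsl.uniquote, and is written for Python 3 using the standard `html` module.

```python
import html

# Hypothetical table: entity reference -> equivalent Unicode character.
ENTITY_TO_CHAR = {
    "&#8211;": "\u2013",  # en dash
    "&#8212;": "\u2014",  # em dash
    "&#8230;": "\u2026",  # horizontal ellipsis
    "&#8216;": "\u2018",  # left single quote
    "&#8217;": "\u2019",  # right single quote / apostrophe
    "&#8220;": "\u201c",  # left double quote
    "&#8221;": "\u201d",  # right double quote
}

# Decoding each entity yields exactly the character listed beside it.
ALL_MATCH = all(html.unescape(e) == c for e, c in ENTITY_TO_CHAR.items())
```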

This code is based on the smartypants.py module by Chad Miller.

Author: Michael Alyn Miller <malyn@strangeGizmo.com>

Copyright: 2006 by Michael Alyn Miller

License: BSD License (see source code for full license)

Function Summary
  transformDashes(text)
Transform ASCII dashes into Unicode em and en dashes.
  transformEllipses(text)
Transform ASCII ellipses into Unicode ellipsis characters.
  transformQuotes(text)
Transform ASCII quote characters (" and ') into curly quotes.

Variable Summary
SRE_Pattern RE_ELLIPSIS = \.\.\.
unicode RE_ELLIPSIS_SUB = u'\u2026'
SRE_Pattern RE_EMDASH = ---
unicode RE_EMDASH_SUB = u'\u2014'
SRE_Pattern RE_ENDASH = --
unicode RE_ENDASH_SUB = u'\u2013'

Function Details

transformDashes(text)

Transform ASCII dashes into Unicode em and en dashes.

Two dashes (--) are converted into an en dash:
>>> transformDashes('Wait 10--15 minutes.')
u'Wait 10\u201315 minutes.'
Three dashes (---) are converted into an em dash:
>>> transformDashes('Wait---or not.')
u'Wait\u2014or not.'
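
The behavior above can be approximated with the patterns listed in the Variable Summary. This is a minimal sketch rather than the module's actual source: the function name `transform_dashes_sketch` is hypothetical, the constants are re-declared locally, and the code targets Python 3, where every string is Unicode. It assumes the em dash pattern is applied first, since otherwise `---` would be consumed as `--` followed by a stray hyphen.

```python
import re

# Local stand-ins mirroring the values shown in the Variable Summary.
RE_EMDASH = re.compile(r"---")
RE_EMDASH_SUB = "\u2014"
RE_ENDASH = re.compile(r"--")
RE_ENDASH_SUB = "\u2013"


def transform_dashes_sketch(text):
    """Replace --- with an em dash, then -- with an en dash."""
    # Longest sequence first so that "---" is not split into "--" + "-".
    text = RE_EMDASH.sub(RE_EMDASH_SUB, text)
    return RE_ENDASH.sub(RE_ENDASH_SUB, text)
```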

transformEllipses(text)

Transform ASCII ellipses into Unicode ellipsis characters.

Three dots (...) are converted into an ellipsis:
>>> transformEllipses('Wait for it...')
u'Wait for it\u2026'
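
A minimal sketch of the same substitution, assuming it is a single regular-expression replacement built from RE_ELLIPSIS and RE_ELLIPSIS_SUB. The function name `transform_ellipses_sketch` is hypothetical, the constants are re-declared locally, and the code is written for Python 3.

```python
import re

# Local stand-ins mirroring the values shown in the Variable Summary.
RE_ELLIPSIS = re.compile(r"\.\.\.")
RE_ELLIPSIS_SUB = "\u2026"


def transform_ellipses_sketch(text):
    """Replace each run of three ASCII dots with one ellipsis character."""
    return RE_ELLIPSIS.sub(RE_ELLIPSIS_SUB, text)
```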

transformQuotes(text)

Transform ASCII quote characters (" and ') into curly quotes.

Double and single quotes are converted into the matching Unicode curly quotes:
>>> transformQuotes('"Hello there"')
u'\u201cHello there\u201d'

>>> transformQuotes("I 'believe' you.")
u'I \u2018believe\u2019 you.'
Single quotes embedded in double quotes are parsed correctly:
>>> transformQuotes('Did he just say, "I \'believe\' you?"')
u'Did he just say, \u201cI \u2018believe\u2019 you?\u201d'
Apostrophes are turned into curly single quotes:
>>> transformQuotes("He's got some nerve.")
u'He\u2019s got some nerve.'
The sequence 's at the beginning of a string usually indicates a possessive phrase that was split by inline formatting, so the quote is treated as an apostrophe rather than an opening quote. For example, the string "<i>Bill</i>'s Noodle House" would be split into "Bill" and "'s Noodle House".
>>> transformQuotes("He")
'He'

>>> transformQuotes("'s got some nerve.")
u'\u2019s got some nerve.'

A leading quote followed by any other word is still treated as an opening quote:
>>> transformQuotes("'spose so.")
u'\u2018spose so.'
Decade abbreviations get special handling:
>>> transformQuotes("Dot-coms?  Yeah, I remember the '90s.")
u'Dot-coms?  Yeah, I remember the \u201990s.'
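
The rules illustrated by the examples above can be sketched as an ordered series of substitutions. This is a deliberately simplified, hypothetical approximation written for Python 3, not the module's implementation (which is based on smartypants.py and may handle many more cases). The name `curl_quotes_sketch` and every pattern in it are assumptions chosen only to reproduce the documented examples.

```python
import re


def curl_quotes_sketch(text):
    """Hypothetical, simplified quote curler; not the module's real code."""
    # Decade abbreviations such as '90s take an apostrophe.
    text = re.sub(r"'(?=\d\ds)", "\u2019", text)
    # A leading 's followed by whitespace is a split-off possessive.
    text = re.sub(r"^'s(?=\s)", "\u2019s", text)
    # A quote right after a word character is an apostrophe or closing quote.
    text = re.sub(r"(?<=\w)'", "\u2019", text)
    # A remaining quote right before a word character opens a quotation.
    text = re.sub(r"'(?=\w)", "\u2018", text)
    # Any single quote still left is treated as closing.
    text = text.replace("'", "\u2019")
    # Double quotes open at string start or after whitespace, else close.
    text = re.sub(r'(?:^|(?<=\s))"(?=\S)', "\u201c", text)
    return text.replace('"', "\u201d")
```

Rule order matters in this sketch: the special cases (decades, leading 's) run before the general opening and closing rules so that they are not misread as opening quotes.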

Variable Details

RE_ELLIPSIS

Type: SRE_Pattern
Value: \.\.\.

RE_ELLIPSIS_SUB

Type: unicode
Value: u'\u2026'

RE_EMDASH

Type: SRE_Pattern
Value: ---

RE_EMDASH_SUB

Type: unicode
Value: u'\u2014'

RE_ENDASH

Type: SRE_Pattern
Value: --

RE_ENDASH_SUB

Type: unicode
Value: u'\u2013'

Generated by Epydoc 2.1 on Wed Jul 12 11:20:41 2006 http://epydoc.sf.net