Qt 4.8
QTextCodec Class Reference (abstract)

The QTextCodec class provides conversions between text encodings.

#include <qtextcodec.h>

Inherited by: QBig5Codec, QBig5hkscsCodec, QCP949Codec, QEucJpCodec, QEucKrCodec, QFontBig5Codec, QFontBig5hkscsCodec, QFontGb18030_0Codec, QFontGb2312Codec, QFontGbkCodec, QFontJis0201Codec, QFontJis0208Codec, QFontKsc5601Codec, QFontLaoCodec, QGb18030Codec, QIconvCodec, QIsciiCodec, QJisCodec, QLatin15Codec, QLatin1Codec, QSimpleTextCodec, QSjisCodec, QTsciiCodec, QUtf16Codec, QUtf32Codec, QUtf8Codec, and QWindowsLocalCodec.

Classes

struct  ConverterState
 

Public Types

enum  ConversionFlag { DefaultConversion, ConvertInvalidToNull = 0x80000000, IgnoreHeader = 0x1, FreeFunction = 0x2 }
 

Public Functions

virtual QList< QByteArray > aliases () const
 Subclasses can return a number of aliases for the codec in question.

bool canEncode (QChar) const
 Returns true if the Unicode character ch can be fully encoded with this codec; otherwise returns false.

bool canEncode (const QString &) const
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. s contains the string being tested for encode-ability.

QByteArray fromUnicode (const QString &uc) const
 Converts uc from Unicode to the encoding of this codec, and returns the result in a QByteArray.

QByteArray fromUnicode (const QChar *in, int length, ConverterState *state=0) const
 Converts the first length characters of the input array in from Unicode to the encoding of this codec, and returns the result in a QByteArray.

QTextDecoder * makeDecoder () const
 Creates a QTextDecoder which stores enough state to decode chunks of char * data to create chunks of Unicode data.

QTextDecoder * makeDecoder (ConversionFlags flags) const

QTextEncoder * makeEncoder () const
 Creates a QTextEncoder which stores enough state to encode chunks of Unicode data as char * data.

QTextEncoder * makeEncoder (ConversionFlags flags) const

virtual int mibEnum () const =0
 Subclasses of QTextCodec must reimplement this function.

virtual QByteArray name () const =0
 QTextCodec subclasses must reimplement this function.

QString toUnicode (const QByteArray &) const
 Converts the given byte array from the encoding of this codec to Unicode, and returns the result in a QString.

QString toUnicode (const char *chars) const
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. chars contains the source characters.

QString toUnicode (const char *in, int length, ConverterState *state=0) const
 Converts the first length characters of in from the encoding of this codec to Unicode, and returns the result in a QString.
 

Static Public Functions

static QList< QByteArray > availableCodecs ()
 Returns the list of all available codecs, by name.

static QList< int > availableMibs ()
 Returns the list of MIBs for all available codecs.

static QTextCodec * codecForCStrings ()
 Returns the codec used by QString to convert to and from const char * and QByteArrays.

static QTextCodec * codecForHtml (const QByteArray &ba)
 Tries to detect the encoding of the provided snippet of HTML in the given byte array, ba, by checking the BOM (Byte Order Mark) and the content-type meta header and returns a QTextCodec instance that is capable of decoding the HTML to Unicode.

static QTextCodec * codecForHtml (const QByteArray &ba, QTextCodec *defaultCodec)
 Tries to detect the encoding of the provided snippet of HTML in the given byte array, ba, by checking the BOM (Byte Order Mark) and the content-type meta header and returns a QTextCodec instance that is capable of decoding the HTML to Unicode.

static QTextCodec * codecForLocale ()
 Returns a pointer to the codec most suitable for this locale.

static QTextCodec * codecForMib (int mib)
 Returns the QTextCodec which matches the MIBenum mib.

static QTextCodec * codecForName (const QByteArray &name)
 Searches all installed QTextCodec objects and returns the one which best matches name; the match is case-insensitive.

static QTextCodec * codecForName (const char *name)
 Searches all installed QTextCodec objects and returns the one which best matches name; the match is case-insensitive.

static QTextCodec * codecForTr ()
 Returns the codec used by QObject::tr() on its argument.

static QTextCodec * codecForUtfText (const QByteArray &ba)
 Tries to detect the encoding of the provided snippet ba by using the BOM (Byte Order Mark) and returns a QTextCodec instance that is capable of decoding the text to Unicode.

static QTextCodec * codecForUtfText (const QByteArray &ba, QTextCodec *defaultCodec)
 Tries to detect the encoding of the provided snippet ba by using the BOM (Byte Order Mark) and returns a QTextCodec instance that is capable of decoding the text to Unicode.

static void setCodecForCStrings (QTextCodec *c)

static void setCodecForLocale (QTextCodec *c)
 Set the codec to c; this will be returned by codecForLocale().
 
static void setCodecForTr (QTextCodec *c)
 

Protected Functions

virtual QByteArray convertFromUnicode (const QChar *in, int length, ConverterState *state) const =0
 QTextCodec subclasses must reimplement this function.

virtual QString convertToUnicode (const char *in, int length, ConverterState *state) const =0
 QTextCodec subclasses must reimplement this function.

 QTextCodec ()
 Constructs a QTextCodec, and gives it the highest precedence.

virtual ~QTextCodec ()
 Destroys the QTextCodec.
 

Static Private Functions

static bool validCodecs ()
 

Static Private Attributes

static QTextCodec * cftr = 0
 

Friends

class QTextCodecCleanup
 

Detailed Description

The QTextCodec class provides conversions between text encodings.

Note
This class or function is reentrant.

Qt uses Unicode to store, draw and manipulate strings. In many situations you may wish to deal with data that uses a different encoding. For example, most Japanese documents are still stored in Shift-JIS or ISO 2022-JP, while Russian users often have their documents in KOI8-R or Windows-1251.

Qt provides a set of QTextCodec classes to help with converting non-Unicode formats to and from Unicode. You can also create your own codec classes.

The supported encodings are:

QTextCodecs can be used as follows to convert some locally encoded string to Unicode. Suppose you have some string encoded in Russian KOI8-R encoding, and want to convert it to Unicode. The simple way to do it is like this:

QByteArray encodedString = "...";
QTextCodec *codec = QTextCodec::codecForName("KOI8-R");
QString string = codec->toUnicode(encodedString);

After this, string holds the text converted to Unicode. Converting a string from Unicode to the local encoding is just as easy:

QString string = "...";
QTextCodec *codec = QTextCodec::codecForName("KOI8-R");
QByteArray encodedString = codec->fromUnicode(string);

To read or write files in various encodings, use QTextStream and its setCodec() function. See the Codecs example for an application of QTextCodec to file I/O.

Some care must be taken when trying to convert the data in chunks, for example, when receiving it over a network. In such cases it is possible that a multi-byte character will be split over two chunks. At best this might result in the loss of a character and at worst cause the entire conversion to fail.

The approach to use in these situations is to create a QTextDecoder object for the codec and use this QTextDecoder for the whole decoding process, as shown below:

QTextCodec *codec = QTextCodec::codecForName("Shift-JIS");
QTextDecoder *decoder = codec->makeDecoder();
QString string;
while (new_data_available()) {
    QByteArray chunk = get_new_data();
    string += decoder->toUnicode(chunk);
}
delete decoder;

The QTextDecoder object maintains state between chunks and therefore works correctly even if a multi-byte character is split between chunks.
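
Why the state matters can be illustrated without Qt. The following is a minimal sketch in standard C++, not Qt's implementation: the class name Utf8ChunkDecoder and its members are hypothetical, it handles UTF-8 only, and it does not validate continuation bytes. It keeps the bytes of an incomplete trailing character and prepends them to the next chunk.

```cpp
#include <cassert>   // for assert() in callers' checks
#include <cstddef>
#include <string>

// Hypothetical stand-in for a stateful decoder; this is not QTextDecoder.
class Utf8ChunkDecoder {
public:
    // Decodes every complete UTF-8 sequence in the buffered data and keeps
    // an incomplete trailing sequence for the next call.
    std::u32string decode(const std::string &chunk) {
        pending_ += chunk;
        std::u32string out;
        std::size_t i = 0;
        while (i < pending_.size()) {
            const unsigned char lead = static_cast<unsigned char>(pending_[i]);
            const std::size_t len = sequenceLength(lead);
            if (len == 0) {              // invalid lead byte
                out.push_back(0xFFFD);   // U+FFFD REPLACEMENT CHARACTER
                ++i;
                continue;
            }
            if (i + len > pending_.size())
                break;                   // character split across chunks
            // Payload bits of the lead byte, then 6 bits per continuation
            // byte (continuation bytes are not validated, for brevity).
            char32_t cp = (len == 1) ? lead : (lead & (0xFF >> (len + 1)));
            for (std::size_t k = 1; k < len; ++k)
                cp = (cp << 6) | (static_cast<unsigned char>(pending_[i + k]) & 0x3F);
            out.push_back(cp);
            i += len;
        }
        pending_.erase(0, i);            // drop what was decoded
        return out;
    }

    bool hasPendingBytes() const { return !pending_.empty(); }

private:
    // Number of bytes in the sequence announced by lead, or 0 if invalid.
    static std::size_t sequenceLength(unsigned char lead) {
        if (lead < 0x80) return 1;
        if ((lead & 0xE0) == 0xC0) return 2;
        if ((lead & 0xF0) == 0xE0) return 3;
        if ((lead & 0xF8) == 0xF0) return 4;
        return 0;
    }

    std::string pending_;                // bytes carried over between chunks
};
```

Feeding the two bytes of U+00E9 (C3 A9) in separate calls yields nothing for that character on the first call and the complete character on the second, which is the behaviour the paragraph above describes for QTextDecoder.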

Creating Your Own Codec Class

Support for new text encodings can be added to Qt by creating QTextCodec subclasses.

The pure virtual functions describe the codec to the system. The codec is then used as required by the different text file formats supported by QTextStream and, under X11, for locale-specific character input and output.

To add support for another encoding to Qt, make a subclass of QTextCodec and implement the functions listed in the table below.

Function              Description

name()                Returns the official name for the encoding. If the encoding is listed in the IANA character-sets encoding file, the name should be the preferred MIME name for the encoding.

aliases()             Returns a list of alternative names for the encoding. QTextCodec provides a default implementation that returns an empty list. For example, "ISO-8859-1" has "latin1", "CP819", "IBM819", and "iso-ir-100" as aliases.

mibEnum()             Returns the MIB enum for the encoding if it is listed in the IANA character-sets encoding file.

convertToUnicode()    Converts an 8-bit character string to Unicode.

convertFromUnicode()  Converts a Unicode string to an 8-bit character string.

You may find it more convenient to make your codec class available as a plugin; see How to Create Qt Plugins for details.

See also
QTextStream, QTextDecoder, QTextEncoder, Codecs Example

Definition at line 62 of file qtextcodec.h.

Enumerations

◆ ConversionFlag

  • DefaultConversion No flag is set.
  • ConvertInvalidToNull If this flag is set, each invalid input character is output as a null character.
  • IgnoreHeader Ignore any Unicode byte-order mark and don't generate any.
  • FreeFunction

Definition at line 94 of file qtextcodec.h.

Constructors and Destructors

◆ QTextCodec()

QTextCodec::QTextCodec ( )
protected

Constructs a QTextCodec, and gives it the highest precedence.

The QTextCodec should always be constructed on the heap (i.e. with new). Qt takes ownership and will delete it when the application terminates.

Definition at line 985 of file qtextcodec.cpp.

{
#ifndef QT_NO_THREAD
    QMutexLocker locker(textCodecsMutex());
#endif
    setup();
    all->prepend(this);
}

◆ ~QTextCodec()

QTextCodec::~QTextCodec ( )
protected virtual

Destroys the QTextCodec.

Note
This class or function is not reentrant.

Note that you should not delete codecs yourself: once created they become Qt's responsibility.

Definition at line 1004 of file qtextcodec.cpp.

{
#ifdef Q_DEBUG_TEXTCODEC
    if (!destroying_is_ok)
        qWarning("QTextCodec::~QTextCodec: Called by application");
#endif
    if (all) {
#ifndef QT_NO_THREAD
        QMutexLocker locker(textCodecsMutex());
#endif
        all->removeAll(this);
        QTextCodecCache *cache = qTextCodecCache();
        if (cache)
            cache->clear();
    }
}

Functions

◆ aliases()

QList< QByteArray > QTextCodec::aliases ( ) const
virtual

Subclasses can return a number of aliases for the codec in question.

Standard aliases for codecs can be found in the IANA character-sets encoding file.

Reimplemented in QUtf32LECodec, QUtf32BECodec, QFontGb18030_0Codec, QUtf32Codec, QFontGbkCodec, QUtf16LECodec, QUtf16BECodec, QFontKsc5601Codec, QFontBig5hkscsCodec, QUtf16Codec, QCP949Codec, QFontBig5Codec, QEucJpCodec, QJisCodec, QSjisCodec, QLatin15Codec, QEucKrCodec, QFontJis0208Codec, QSimpleTextCodec, QGbkCodec, QBig5hkscsCodec, QLatin1Codec, QGb18030Codec, QFontJis0201Codec, and QBig5Codec.

Definition at line 1287 of file qtextcodec.cpp.

Referenced by codecForName().

{
    return QList<QByteArray>();
}

◆ availableCodecs()

QList< QByteArray > QTextCodec::availableCodecs ( )
static

Returns the list of all available codecs, by name.

Call QTextCodec::codecForName() to obtain the QTextCodec for the name.

The list may contain many mentions of the same codec if the codec has aliases.

See also
availableMibs(), name(), aliases()

Definition at line 1132 of file qtextcodec.cpp.

{
#ifndef QT_NO_THREAD
    QMutexLocker locker(textCodecsMutex());
#endif
    setup();

    QList<QByteArray> codecs;

    if (!validCodecs())
        return codecs;

    for (int i = 0; i < all->size(); ++i) {
        codecs += all->at(i)->name();
        codecs += all->at(i)->aliases();
    }

#ifndef QT_NO_THREAD
    locker.unlock();
#endif

#if !defined(QT_NO_LIBRARY) && !defined(QT_NO_TEXTCODECPLUGIN)
    QFactoryLoader *l = loader();
    QStringList keys = l->keys();
    for (int i = 0; i < keys.size(); ++i) {
        if (!keys.at(i).startsWith(QLatin1String("MIB: "))) {
            QByteArray name = keys.at(i).toLatin1();
            if (!codecs.contains(name))
                codecs += name;
        }
    }
#endif

    return codecs;
}

◆ availableMibs()

QList< int > QTextCodec::availableMibs ( )
static

Returns the list of MIBs for all available codecs.

Call QTextCodec::codecForMib() to obtain the QTextCodec for the MIB.

See also
availableCodecs(), mibEnum()

Definition at line 1174 of file qtextcodec.cpp.

{
#ifndef QT_NO_THREAD
    QMutexLocker locker(textCodecsMutex());
#endif
    setup();

    QList<int> codecs;

    if (!validCodecs())
        return codecs;

    for (int i = 0; i < all->size(); ++i)
        codecs += all->at(i)->mibEnum();

#ifndef QT_NO_THREAD
    locker.unlock();
#endif

#if !defined(QT_NO_LIBRARY) && !defined(QT_NO_TEXTCODECPLUGIN)
    QFactoryLoader *l = loader();
    QStringList keys = l->keys();
    for (int i = 0; i < keys.size(); ++i) {
        if (keys.at(i).startsWith(QLatin1String("MIB: "))) {
            int mib = keys.at(i).mid(5).toInt();
            if (!codecs.contains(mib))
                codecs += mib;
        }
    }
#endif

    return codecs;
}

◆ canEncode() [1/2]

bool QTextCodec::canEncode ( QChar  ch) const

Returns true if the Unicode character ch can be fully encoded with this codec; otherwise returns false.

Definition at line 1417 of file qtextcodec.cpp.

Referenced by encodeText().

{
    ConverterState state;
    state.flags = ConvertInvalidToNull;
    convertFromUnicode(&ch, 1, &state);
    return (state.invalidChars == 0);
}
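
The definition above follows a simple pattern: run the conversion with ConvertInvalidToNull set, discard the output, and report whether any character was counted as invalid. A minimal stand-alone sketch of that pattern, assuming a Latin-1 target encoding, is shown here. ConverterStateSketch, toLatin1Sketch() and canEncodeSketch() are hypothetical names for illustration and are not part of Qt.

```cpp
#include <cassert>   // for assert() in callers' checks
#include <string>

// Hypothetical mirror of the one field the pattern needs; not Qt's struct.
struct ConverterStateSketch {
    int invalidChars = 0;
};

// Hypothetical Latin-1 encoder. Code points above U+00FF have no Latin-1
// representation, so each one is written as a null byte and counted.
std::string toLatin1Sketch(const std::u32string &in, ConverterStateSketch *state) {
    std::string out;
    for (char32_t cp : in) {
        if (cp <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
        } else {
            out.push_back('\0');
            if (state)
                ++state->invalidChars;
        }
    }
    return out;
}

// Same shape as the listing: convert, ignore the bytes, inspect the count.
bool canEncodeSketch(const std::u32string &s) {
    ConverterStateSketch state;
    toLatin1Sketch(s, &state);
    return state.invalidChars == 0;
}
```

With this sketch, a string containing only U+00E9 is reported as encodable, while a string containing U+20AC (the euro sign) is not. The cost of the approach is that the whole input is converted just to obtain a yes/no answer.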

◆ canEncode() [2/2]

bool QTextCodec::canEncode ( const QString & s) const

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. s contains the string being tested for encode-ability.

Definition at line 1430 of file qtextcodec.cpp.

{
    ConverterState state;
    state.flags = ConvertInvalidToNull;
    convertFromUnicode(s.constData(), s.length(), &state);
    return (state.invalidChars == 0);
}

◆ codecForCStrings()

QTextCodec * QTextCodec::codecForCStrings ( )
inline static

Returns the codec used by QString to convert to and from const char * and QByteArrays.

If this function returns 0 (the default), QString assumes Latin-1.

See also
setCodecForCStrings()

Definition at line 157 of file qtextcodec.h.

Referenced by QChar::fromAscii(), QChar::QChar(), and QChar::toAscii().

{ return validCodecs() ? QString::codecForCStrings : 0; }

◆ codecForHtml() [1/2]

QTextCodec * QTextCodec::codecForHtml ( const QByteArray & ba)
static

Tries to detect the encoding of the provided snippet of HTML in the given byte array, ba, by checking the BOM (Byte Order Mark) and the content-type meta header and returns a QTextCodec instance that is capable of decoding the HTML to Unicode.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

If the codec cannot be detected, this overload returns a Latin-1 QTextCodec.

Definition at line 1807 of file qtextcodec.cpp.

Referenced by Qt::codecForHtml(), QDeclarativeXMLHttpRequest::findTextCodec(), QMimeDataPrivate::retrieveTypedData(), and QClipboard::text().

{
    return codecForHtml(ba, QTextCodec::codecForMib(/*Latin 1*/ 4));
}

◆ codecForHtml() [2/2]

QTextCodec * QTextCodec::codecForHtml ( const QByteArray & ba,
QTextCodec * defaultCodec
)
static

Tries to detect the encoding of the provided snippet of HTML in the given byte array, ba, by checking the BOM (Byte Order Mark) and the content-type meta header and returns a QTextCodec instance that is capable of decoding the HTML to Unicode.

Since
4.4

If the codec cannot be detected from the content provided, defaultCodec is returned.

See also
codecForUtfText()

Definition at line 1768 of file qtextcodec.cpp.

{
    // determine charset
    int pos;
    QTextCodec *c = 0;

    c = QTextCodec::codecForUtfText(ba, c);
    if (!c) {
        QByteArray header = ba.left(512).toLower();
        if ((pos = header.indexOf("http-equiv=")) != -1) {
            if ((pos = header.lastIndexOf("meta ", pos)) != -1) {
                pos = header.indexOf("charset=", pos) + int(strlen("charset="));
                if (pos != -1) {
                    int pos2 = header.indexOf('\"', pos+1);
                    QByteArray cs = header.mid(pos, pos2-pos);
                    // qDebug("found charset: %s", cs.data());
                    c = QTextCodec::codecForName(cs);
                }
            }
        }
    }
    if (!c)
        c = defaultCodec;

    return c;
}
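
The meta-header scan in the definition above can be tried outside Qt. The following is a minimal sketch in standard C++ under stated assumptions: charsetFromHtmlSketch() is a hypothetical helper, it returns the lower-cased charset name instead of a codec, and it omits the BOM check that the definition performs first. It deliberately differs from the listing in two places. It tests for a missing "charset=" before adding the key length (in the listing the length is added before the comparison with -1), and it gives up when no closing quote is found.

```cpp
#include <algorithm>
#include <cassert>   // for assert() in callers' checks
#include <cctype>
#include <string>

// Hypothetical helper: returns the charset named in a content-type meta
// header within the first 512 bytes of html, or an empty string.
std::string charsetFromHtmlSketch(const std::string &html) {
    // Only the start of the document is inspected, compared in lower case.
    std::string header = html.substr(0, 512);
    std::transform(header.begin(), header.end(), header.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Find http-equiv=, then search backwards for the meta tag holding it.
    const std::string::size_type equiv = header.find("http-equiv=");
    if (equiv == std::string::npos)
        return std::string();
    const std::string::size_type meta = header.rfind("meta ", equiv);
    if (meta == std::string::npos)
        return std::string();

    // Locate charset= and check the result before skipping past the key.
    const std::string key = "charset=";
    std::string::size_type start = header.find(key, meta);
    if (start == std::string::npos)
        return std::string();
    start += key.size();

    // The charset name runs up to the closing double quote.
    const std::string::size_type end = header.find('"', start);
    if (end == std::string::npos)
        return std::string();
    return header.substr(start, end - start);
}
```

A caller would pass the returned name to a lookup such as codecForName() and fall back to a default when the string is empty or the lookup fails.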

◆ codecForLocale()

QTextCodec * QTextCodec::codecForLocale ( )
static

Returns a pointer to the codec most suitable for this locale.

On Windows, the codec is based on the system locale. On Unix systems, starting with Qt 4.2, the codec uses the iconv library. In both cases the codec's name is "System".

Definition at line 1237 of file qtextcodec.cpp.

Referenced by codec(), QTextStreamPrivate::fillReadBuffer(), QTextStreamPrivate::flushWriteBuffer(), QString::fromLocal8Bit(), QTextStream::locale(), QX11Data::motifdndFormat(), QX11Data::motifdndObtainData(), QCUPSSupport::QCUPSSupport(), qstring_to_xtp(), qt_set_input_encoding(), qt_x11_set_fallback_font_family(), QTextStreamPrivate::reset(), QString::toAscii(), QString::toLocal8Bit(), QStringRef::toLocal8Bit(), QXlibKeyboard::translateKeySym(), and QXcbKeyboard::translateKeySym().

{
    if (!validCodecs())
        return 0;

    if (localeMapper)
        return localeMapper;

#ifndef QT_NO_THREAD
    QMutexLocker locker(textCodecsMutex());
#endif
    setup();

    return localeMapper;
}

◆ codecForMib()

QTextCodec * QTextCodec::codecForMib ( int  mib)
static

Returns the QTextCodec which matches the MIBenum mib.

Definition at line 1082 of file qtextcodec.cpp.

Referenced by codecForHtml(), codecForUtfText(), QXmlInputSource::fromRawData(), QXmlStreamReaderPrivate::getChar_helper(), QXmlStreamReaderPrivate::init(), QFontEngineXLFD::QFontEngineXLFD(), QIconvCodec::QIconvCodec(), QXmlStreamWriterPrivate::QXmlStreamWriterPrivate(), QPatternist::AccelTreeResourceLoader::retrieveUnparsedText(), QClipboard::text(), QXlibKeyboard::translateKeySym(), QXcbKeyboard::translateKeySym(), and translateKeySym().

{
#ifndef QT_NO_THREAD
    QMutexLocker locker(textCodecsMutex());
#endif
    setup();

    if (!validCodecs())
        return 0;

    QByteArray key = "MIB: " + QByteArray::number(mib);
    QTextCodecCache *cache = qTextCodecCache();
    QTextCodec *codec;
    if (cache) {
        codec = cache->value(key);
        if (codec)
            return codec;
    }

    for (int i = 0; i < all->size(); ++i) {
        QTextCodec *cursor = all->at(i);
        if (cursor->mibEnum() == mib) {
            if (cache)
                cache->insert(key, cursor);
            return cursor;
        }
    }

    codec = createForMib(mib);

    // Qt 3 used 1000 (mib for UCS2) as its identifier for the utf16 codec. Map
    // this correctly for compatibility.
    if (!codec && mib == 1000)
        return codecForMib(1015);

    if (codec && cache)
        cache->insert(key, codec);
    return codec;
}

◆ codecForName() [1/2]

QTextCodec * QTextCodec::codecForName ( const QByteArray & name)
static

Searches all installed QTextCodec objects and returns the one which best matches name; the match is case-insensitive.

Returns 0 if no codec matching the name name could be found.

Definition at line 1034 of file qtextcodec.cpp.

Referenced by codec(), codecForHtml(), QSvgPaintEngine::end(), QDeclarativeXMLHttpRequest::findTextCodec(), QXmlInputSource::fromRawData(), QTextStream::locale(), QWindowsLocalCodec::mibEnum(), QXlibMime::mimeConvertToFormat(), QIBaseDriver::open(), QCUPSSupport::QCUPSSupport(), QApplicationPrivate::qt_mac_apply_settings(), qt_set_input_encoding(), QMimeDataPrivate::retrieveTypedData(), QPatternist::AccelTreeResourceLoader::retrieveUnparsedText(), QDomDocumentPrivate::saveDocument(), QTextDocumentWriter::setCodec(), QTextStream::setCodec(), QXmlStreamWriter::setCodec(), QSettings::setIniCodec(), setupLocaleMapper(), QXmlStreamReaderPrivate::startDocument(), translateKeyEventInternal(), QKeyMapperPrivate::updateKeyMap(), QApplicationPrivate::x11_apply_settings(), and QX11Data::xdndMimeConvertToFormat().

{
    if (name.isEmpty())
        return 0;

#ifndef QT_NO_THREAD
    QMutexLocker locker(textCodecsMutex());
#endif
    setup();

    if (!validCodecs())
        return 0;

    QTextCodecCache *cache = qTextCodecCache();
    QTextCodec *codec;
    if (cache) {
        codec = cache->value(name);
        if (codec)
            return codec;
    }

    for (int i = 0; i < all->size(); ++i) {
        QTextCodec *cursor = all->at(i);
        if (nameMatch(cursor->name(), name)) {
            if (cache)
                cache->insert(name, cursor);
            return cursor;
        }
        QList<QByteArray> aliases = cursor->aliases();
        for (int y = 0; y < aliases.size(); ++y)
            if (nameMatch(aliases.at(y), name)) {
                if (cache)
                    cache->insert(name, cursor);
                return cursor;
            }
    }

    codec = createForName(name);
    if (codec && cache)
        cache->insert(name, codec);
    return codec;
}

◆ codecForName() [2/2]

QTextCodec * QTextCodec::codecForName ( const char *  name)
inline static

Searches all installed QTextCodec objects and returns the one which best matches name; the match is case-insensitive.

Returns 0 if no codec matching the name name could be found.

Definition at line 67 of file qtextcodec.h.

Referenced by codecForName().

{ return codecForName(QByteArray(name)); }

◆ codecForTr()

QTextCodec * QTextCodec::codecForTr ( )
inline static

Returns the codec used by QObject::tr() on its argument.

If this function returns 0 (the default), tr() assumes Latin-1.

See also
setCodecForTr()

Definition at line 155 of file qtextcodec.h.

Referenced by QCoreApplication::translate().

{ return validCodecs() ? cftr : 0; }

◆ codecForUtfText() [1/2]

QTextCodec * QTextCodec::codecForUtfText ( const QByteArray & ba)
static

Tries to detect the encoding of the provided snippet ba by using the BOM (Byte Order Mark) and returns a QTextCodec instance that is capable of decoding the text to Unicode.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

If the codec cannot be detected, this overload returns a Latin-1 QTextCodec.

See also
codecForHtml()

Definition at line 1873 of file qtextcodec.cpp.

Referenced by codecForHtml(), QTextStreamPrivate::fillReadBuffer(), QDeclarativeXMLHttpRequest::findTextCodec(), and QClipboard::text().

1874 {
1875  return codecForUtfText(ba, QTextCodec::codecForMib(/*Latin 1*/ 4));
1876 }
static QTextCodec * codecForMib(int mib)
Returns the QTextCodec which matches the MIBenum mib.
static QTextCodec * codecForUtfText(const QByteArray &ba)
Tries to detect the encoding of the provided snippet ba by using the BOM (Byte Order Mark) and return...

◆ codecForUtfText() [2/2]

QTextCodec * QTextCodec::codecForUtfText ( const QByteArray & ba,
QTextCodec * defaultCodec 
)
static

Tries to detect the encoding of the provided snippet ba from its BOM (Byte Order Mark) and returns a QTextCodec instance that can decode the text to Unicode.

Since
4.6

If the codec cannot be detected from the content provided, defaultCodec is returned.

See also
codecForHtml()

Definition at line 1826 of file qtextcodec.cpp.

1827 {
1828  const int arraySize = ba.size();
1829 
1830  if (arraySize > 3) {
1831  if ((uchar)ba[0] == 0x00
1832  && (uchar)ba[1] == 0x00
1833  && (uchar)ba[2] == 0xFE
1834  && (uchar)ba[3] == 0xFF)
1835  return QTextCodec::codecForMib(1018); // utf-32 be
1836  else if ((uchar)ba[0] == 0xFF
1837  && (uchar)ba[1] == 0xFE
1838  && (uchar)ba[2] == 0x00
1839  && (uchar)ba[3] == 0x00)
1840  return QTextCodec::codecForMib(1019); // utf-32 le
1841  }
1842 
1843  if (arraySize < 2)
1844  return defaultCodec;
1845  if ((uchar)ba[0] == 0xfe && (uchar)ba[1] == 0xff)
1846  return QTextCodec::codecForMib(1013); // utf16 be
1847  else if ((uchar)ba[0] == 0xff && (uchar)ba[1] == 0xfe)
1848  return QTextCodec::codecForMib(1014); // utf16 le
1849 
1850  if (arraySize < 3)
1851  return defaultCodec;
1852  if ((uchar)ba[0] == 0xef
1853  && (uchar)ba[1] == 0xbb
1854  && (uchar)ba[2] == 0xbf)
1855  return QTextCodec::codecForMib(106); // utf-8
1856 
1857  return defaultCodec;
1858 }
unsigned char uchar
Definition: qglobal.h:994
static QTextCodec * codecForMib(int mib)
Returns the QTextCodec which matches the MIBenum mib.
int size() const
Returns the number of bytes in this byte array.
Definition: qbytearray.h:402
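
The order of the checks in the listing above matters: the UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE), so the four-byte patterns have to be tested first. The following is a minimal sketch of the same decision sequence using only the C++ standard library. The function name `detectBomMib` and the choice to return a MIBenum integer instead of a codec pointer are assumptions made for illustration; the MIBenum values are the ones passed to codecForMib() in the listing.

```cpp
#include <cassert>
#include <cstddef>

// MIBenum values as used in the listing above.
enum {
    MibUtf8 = 106,
    MibUtf16BE = 1013,
    MibUtf16LE = 1014,
    MibUtf32BE = 1018,
    MibUtf32LE = 1019
};

// Hypothetical helper: returns the MIBenum implied by a leading BOM,
// or defaultMib when no BOM is recognised.
int detectBomMib(const unsigned char *data, std::size_t size, int defaultMib)
{
    assert(data != nullptr || size == 0);

    // UTF-32 first, because FF FE 00 00 would otherwise match the UTF-16 LE test.
    if (size > 3) {
        if (data[0] == 0x00 && data[1] == 0x00 && data[2] == 0xFE && data[3] == 0xFF)
            return MibUtf32BE;
        if (data[0] == 0xFF && data[1] == 0xFE && data[2] == 0x00 && data[3] == 0x00)
            return MibUtf32LE;
    }

    // UTF-16 needs at least two bytes.
    if (size >= 2) {
        if (data[0] == 0xFE && data[1] == 0xFF)
            return MibUtf16BE;
        if (data[0] == 0xFF && data[1] == 0xFE)
            return MibUtf16LE;
    }

    // UTF-8 needs at least three bytes.
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return MibUtf8;

    return defaultMib;
}
```

One consequence of this ordering is that UTF-16 LE text whose first character after the BOM is U+0000 is reported as UTF-32 LE, because its first four bytes are also FF FE 00 00.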

◆ convertFromUnicode()

QByteArray QTextCodec::convertFromUnicode ( const QChar * input,
int  number,
ConverterState * state 
) const
protected pure virtual

QTextCodec subclasses must reimplement this function.

Converts the first number of characters from the input array from Unicode to the encoding of the subclass, and returns the result in a QByteArray.

state can be 0, in which case the conversion is stateless and default conversion rules should be used. If state is not 0, the codec should save the state after the conversion in state, and adjust the remainingChars and invalidChars members of the struct.

Implemented in QWindowsLocalCodec, QFontGb18030_0Codec, QUtf32Codec, QFontGbkCodec, QFontKsc5601Codec, QFontGb2312Codec, QFontBig5hkscsCodec, QUtf16Codec, QCP949Codec, QTsciiCodec, QFontBig5Codec, QGb2312Codec, QUtf8Codec, QEucJpCodec, QJisCodec, QSjisCodec, QEucKrCodec, QFontJis0208Codec, QLatin15Codec, QGbkCodec, QBig5hkscsCodec, QIconvCodec, QSimpleTextCodec, QFontLaoCodec, QIsciiCodec, QLatin1Codec, QGb18030Codec, QBig5Codec, and QFontJis0201Codec.
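
To make the contract for state more concrete, here is a minimal sketch of an encoder that follows it, written against the C++ standard library only. It is not the implementation of any Qt codec. The `State` struct is a simplified, hypothetical stand-in for ConverterState, `encodeLatin1` is a made-up name, and replacing unencodable input with '?' is an assumption chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-in for QTextCodec::ConverterState.
struct State {
    int remainingChars;
    int invalidChars;
    State() : remainingChars(0), invalidChars(0) {}
};

// Encodes UTF-16 code units as Latin-1 bytes. Code units above 0xFF cannot be
// represented and are replaced by '?'. When a state object is supplied, each
// replacement is counted in state->invalidChars.
std::string encodeLatin1(const char16_t *input, int number, State *state)
{
    assert(number >= 0);
    std::string out;
    out.reserve(static_cast<std::size_t>(number));

    int invalid = 0;
    for (int i = 0; i < number; ++i) {
        if (input[i] > 0xFF) {
            out.push_back('?');
            ++invalid;
        } else {
            out.push_back(static_cast<char>(static_cast<unsigned char>(input[i])));
        }
    }

    if (state) {
        state->invalidChars += invalid;  // accumulates across calls
        state->remainingChars = 0;       // this encoding never leaves partial input behind
    }
    return out;
}
```

Passing a null state gives the stateless behaviour: the output is the same, but the caller has no way to learn how many characters were replaced.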

◆ convertToUnicode()

QString QTextCodec::convertToUnicode ( const char *  chars,
int  len,
ConverterState * state 
) const
protected pure virtual

QTextCodec subclasses must reimplement this function.

Converts the first len characters of chars from the encoding of the subclass to Unicode, and returns the result in a QString.

state can be 0, in which case the conversion is stateless and default conversion rules should be used. If state is not 0, the codec should save the state after the conversion in state, and adjust the remainingChars and invalidChars members of the struct.

Implemented in QWindowsLocalCodec, QFontGb18030_0Codec, QUtf32Codec, QFontGbkCodec, QFontKsc5601Codec, QFontGb2312Codec, QFontBig5hkscsCodec, QUtf16Codec, QCP949Codec, QTsciiCodec, QFontBig5Codec, QGb2312Codec, QUtf8Codec, QEucJpCodec, QJisCodec, QSjisCodec, QEucKrCodec, QFontJis0208Codec, QLatin15Codec, QGbkCodec, QBig5hkscsCodec, QIconvCodec, QSimpleTextCodec, QFontLaoCodec, QIsciiCodec, QLatin1Codec, QFontJis0201Codec, QGb18030Codec, and QBig5Codec.
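
The remainingChars member matters when input arrives in chunks that can split a multi-byte character. The sketch below shows the idea with a hypothetical UTF-16 BE decoder that uses only the C++ standard library. `DecodeState`, its `pending` member and `decodeUtf16Be` are invented for this example and do not mirror the layout of ConverterState. Emitting U+FFFD for a dangling byte in the stateless case is likewise an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical decoder state: one byte may be carried over between calls.
struct DecodeState {
    int remainingChars;     // number of bytes carried over (0 or 1 in this sketch)
    unsigned char pending;  // the carried-over byte, meaningful when remainingChars == 1
    DecodeState() : remainingChars(0), pending(0) {}
};

// Decodes big-endian UTF-16 bytes into UTF-16 code units.
// With a state object, a trailing odd byte is saved for the next call.
// Without one, the conversion is stateless and the odd byte becomes U+FFFD.
std::u16string decodeUtf16Be(const char *chars, int len, DecodeState *state)
{
    assert(len >= 0);
    std::u16string out;

    bool haveHigh = false;
    unsigned char high = 0;
    if (state && state->remainingChars == 1) {
        // Resume with the byte left over from the previous chunk.
        haveHigh = true;
        high = state->pending;
        state->remainingChars = 0;
    }

    for (int i = 0; i < len; ++i) {
        const unsigned char byte = static_cast<unsigned char>(chars[i]);
        if (haveHigh) {
            out.push_back(static_cast<char16_t>((high << 8) | byte));
            haveHigh = false;
        } else {
            high = byte;
            haveHigh = true;
        }
    }

    if (haveHigh) {
        if (state) {
            state->pending = high;
            state->remainingChars = 1;
        } else {
            out.push_back(static_cast<char16_t>(0xFFFD));
        }
    }
    return out;
}
```

Feeding the bytes 00 41 00 and then 42 through the same state object yields "A" from the first call and "B" from the second. This chunk-by-chunk pattern is what QTextDecoder packages for callers.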

◆ fromUnicode() [1/2]

QByteArray QTextCodec::fromUnicode ( const QString & uc) const

Converts the Unicode string uc (named str in the definition below) to the encoding of this codec, and returns the result in a QByteArray.

Definition at line 1388 of file qtextcodec.cpp.

Referenced by canEncode(), QAbstractConcatenable::convertToAscii(), encodeString(), QTextStreamPrivate::flushWriteBuffer(), fromUnicode(), QTextEncoder::fromUnicode(), QSettingsPrivate::iniEscapedString(), qstring_to_xtp(), QFontEngineXLFD::stringToCMap(), QChar::toAscii(), QString::toAscii(), QStringRef::toAscii(), QString::toLocal8Bit(), and QStringRef::toLocal8Bit().

1389 {
1390  return convertFromUnicode(str.constData(), str.length(), 0);
1391 }
virtual QByteArray convertFromUnicode(const QChar *in, int length, ConverterState *state) const =0
QTextCodec subclasses must reimplement this function.

◆ fromUnicode() [2/2]

QByteArray QTextCodec::fromUnicode ( const QChar * input,
int  number,
ConverterState * state = 0 
) const
inline

Converts the first number of characters from the input array from Unicode to the encoding of this codec, and returns the result in a QByteArray.

The state of the converter used is updated.

Definition at line 117 of file qtextcodec.h.

118  { return convertFromUnicode(in, length, state); }
virtual QByteArray convertFromUnicode(const QChar *in, int length, ConverterState *state) const =0
QTextCodec subclasses must reimplement this function.

◆ makeDecoder() [1/2]

QTextDecoder * QTextCodec::makeDecoder ( ) const

Creates a QTextDecoder which stores enough state to decode chunks of char * data to create chunks of Unicode data.

The caller is responsible for deleting the returned object.

Definition at line 1330 of file qtextcodec.cpp.

Referenced by QXmlInputSource::fromRawData(), QXmlStreamReaderPrivate::getChar_helper(), and QXmlStreamReaderPrivate::startDocument().

1331 {
1332  return new QTextDecoder(this);
1333 }
The QTextDecoder class provides a state-based decoder.
Definition: qtextcodec.h:177

◆ makeDecoder() [2/2]

QTextDecoder* QTextCodec::makeDecoder ( ConversionFlags  flags) const

◆ makeEncoder() [1/2]

QTextEncoder * QTextCodec::makeEncoder ( ) const

Creates a QTextEncoder which stores enough state to encode chunks of Unicode data as char * data.

The caller is responsible for deleting the returned object.

Definition at line 1355 of file qtextcodec.cpp.

Referenced by QXmlStreamWriterPrivate::QXmlStreamWriterPrivate(), and QXmlStreamWriter::setCodec().

1356 {
1357  return new QTextEncoder(this);
1358 }
The QTextEncoder class provides a state-based encoder.
Definition: qtextcodec.h:160

◆ makeEncoder() [2/2]

QTextEncoder* QTextCodec::makeEncoder ( ConversionFlags  flags) const

◆ mibEnum()

int QTextCodec::mibEnum ( ) const
pure virtual

◆ name()

QByteArray QTextCodec::name ( ) const
pure virtual

◆ setCodecForCStrings()

void QTextCodec::setCodecForCStrings ( QTextCodec * codec)
inline static
Note
This class or function is not reentrant.

Sets the codec used by QString to convert to and from const char * and QByteArrays. If the codec is 0 (the default), QString assumes Latin-1.

Warning
Some codecs do not preserve the characters in the ASCII range (0x00 to 0x7F). For example, the Japanese Shift-JIS encoding maps the backslash character (0x5C) to the Yen character. To avoid undesirable side effects, we recommend avoiding such codecs with setCodecForCStrings().
See also
codecForCStrings(), setCodecForTr()

Definition at line 158 of file qtextcodec.h.

static QTextCodec * codecForCStrings
Definition: qstring.h:621

◆ setCodecForLocale()

void QTextCodec::setCodecForLocale ( QTextCodec * c)
static

Set the codec to c; this will be returned by codecForLocale().

If c is a null pointer, the codec is reset to the default.

This might be needed for some applications that want to use their own mechanism for setting the locale.

See also
codecForLocale()

Definition at line 1218 of file qtextcodec.cpp.

1219 {
1220 #ifndef QT_NO_THREAD
1221  QMutexLocker locker(textCodecsMutex());
1222 #endif
1223  localeMapper = c;
1224  if (!localeMapper)
1225  setupLocaleMapper();
1226 }
static void setupLocaleMapper()
Definition: qtextcodec.cpp:581
The QMutexLocker class is a convenience class that simplifies locking and unlocking mutexes...
Definition: qmutex.h:101
static QTextCodec * localeMapper
Definition: qtextcodec.cpp:193
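
The set-or-reset behaviour described above can be sketched with the C++ standard library alone. Everything named here (`Codec`, `systemDefault`, `setLocaleCodec`, `currentLocaleCodec`) is hypothetical. In particular, the fixed `systemDefault` object only stands in for whatever setupLocaleMapper() would select from the environment.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for a codec object.
struct Codec {
    std::string name;
};

static std::mutex codecMutex;                  // plays the role of textCodecsMutex()
static Codec systemDefault = {"System"};       // assumed default, for illustration only
static Codec *localeCodec = &systemDefault;    // plays the role of localeMapper

// Installs c as the locale codec; a null pointer restores the default.
void setLocaleCodec(Codec *c)
{
    std::lock_guard<std::mutex> locker(codecMutex);
    localeCodec = c;
    if (!localeCodec)
        localeCodec = &systemDefault;
    assert(localeCodec != nullptr);
}

// Returns the codec currently in effect.
Codec *currentLocaleCodec()
{
    std::lock_guard<std::mutex> locker(codecMutex);
    return localeCodec;
}
```

The lock guard is released at the end of each function, which is the same scope-based pattern QMutexLocker provides in the listing.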

◆ setCodecForTr()

void QTextCodec::setCodecForTr ( QTextCodec * c)
inline static
Note
This class or function is not reentrant.

Sets the codec used by QObject::tr() on its argument to c. If c is 0 (the default), tr() assumes Latin-1.

If the literal quoted text in the program is not in the Latin-1 encoding, this function can be used to set the appropriate encoding. For example, software developed by Korean programmers might use eucKR for all the text in the program, in which case the main() function might look like this:

int main(int argc, char *argv[])
{
    QApplication app(argc, argv);
    QTextCodec::setCodecForTr(QTextCodec::codecForName("eucKR"));
    ...
}

Note that this is not the way to select the encoding that the user has chosen. For example, to convert an application containing literal English strings to Korean, all that is needed is for the English strings to be passed through tr() and for translation files to be loaded. For details of internationalization, see Internationalization with Qt.

See also
codecForTr(), setCodecForCStrings()

Definition at line 156 of file qtextcodec.h.

Referenced by QApplicationPrivate::qt_mac_apply_settings(), and QApplicationPrivate::x11_apply_settings().

156 { cftr = c; }
static QTextCodec * cftr
Definition: qtextcodec.h:150

◆ toUnicode() [1/3]

QString QTextCodec::toUnicode ( const QByteArray & a) const

Converts a from the encoding of this codec to Unicode, and returns the result in a QString.

Definition at line 1408 of file qtextcodec.cpp.

Referenced by canEncode(), QIconvCodec::convertToUnicode(), QTextStreamPrivate::fillReadBuffer(), QChar::fromAscii(), QString::fromAscii_helper(), QString::fromLocal8Bit(), getIBaseError(), QIBaseResult::gotoNext(), QSettingsPrivate::iniUnescapedStringList(), QXlibMime::mimeConvertToFormat(), QChar::QChar(), readArrayBuffer(), QDeclarativeXMLHttpRequest::responseBody(), QMimeDataPrivate::retrieveTypedData(), QPatternist::AccelTreeResourceLoader::retrieveUnparsedText(), QTextBrowserPrivate::setSource(), QClipboard::text(), toUnicode(), QFontEngineXLFD::toUnicode(), QTextDecoder::toUnicode(), QCoreApplication::translate(), translateKeyEventInternal(), QXlibKeyboard::translateKeySym(), QXcbKeyboard::translateKeySym(), translateKeySym(), QCUPSSupport::unicodeString(), QKeyMapperPrivate::updateKeyMap(), QString::vsprintf(), QXIMInputContext::x11FilterEvent(), and QX11Data::xdndMimeConvertToFormat().

1409 {
1410  return convertToUnicode(a.constData(), a.length(), 0);
1411 }
int length() const
Same as size().
Definition: qbytearray.h:356
const char * constData() const
Returns a pointer to the data stored in the byte array.
Definition: qbytearray.h:433
virtual QString convertToUnicode(const char *in, int length, ConverterState *state) const =0
QTextCodec subclasses must reimplement this function.

◆ toUnicode() [2/3]

QString QTextCodec::toUnicode ( const char *  chars) const

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. chars contains the source characters.

Definition at line 1485 of file qtextcodec.cpp.

1486 {
1487  int len = qstrlen(chars);
1488  return convertToUnicode(chars, len, 0);
1489 }
uint qstrlen(const char *str)
Definition: qbytearray.h:79
virtual QString convertToUnicode(const char *in, int length, ConverterState *state) const =0
QTextCodec subclasses must reimplement this function.

◆ toUnicode() [3/3]

QString QTextCodec::toUnicode ( const char *  input,
int  size,
ConverterState * state = 0 
) const
inline

Converts the first size characters from the input from the encoding of this codec to Unicode, and returns the result in a QString.

The state of the converter used is updated.

Definition at line 115 of file qtextcodec.h.

116  { return convertToUnicode(in, length, state); }
virtual QString convertToUnicode(const char *in, int length, ConverterState *state) const =0
QTextCodec subclasses must reimplement this function.

◆ validCodecs()

bool QTextCodec::validCodecs ( )
static private

Definition at line 233 of file qtextcodec.cpp.

234 {
235 #ifdef Q_OS_SYMBIAN
236  // If we don't have a trap handler, we're outside of the main() function,
237  // ie. in global constructors or destructors. Don't use codecs in this
238  // case as it would lead to crashes because we don't have a cleanup stack on Symbian
239  return (User::TrapHandler() != NULL);
240 #else
241  return true;
242 #endif
243 }

Friends and Related Functions

◆ QTextCodecCleanup

friend class QTextCodecCleanup
friend

Definition at line 149 of file qtextcodec.h.

Properties

◆ cftr

QTextCodec * QTextCodec::cftr = 0
static private

Definition at line 150 of file qtextcodec.h.


The documentation for this class was generated from the following files: