Qt 4.8
QGb2312Codec Class Reference

The QGb2312Codec class provides conversion to and from the Chinese GB2312 encoding.

#include <qgb18030codec.h>

Inheritance: QGb2312Codec inherits QGb18030Codec, which inherits QTextCodec.

Public Functions

QByteArray convertFromUnicode (const QChar *, int, ConverterState *) const
 Reimplemented function.

QString convertToUnicode (const char *, int, ConverterState *) const
 QTextCodec subclasses must reimplement this function.

int mibEnum () const
 Subclasses of QTextCodec must reimplement this function.

QByteArray name () const
 QTextCodec subclasses must reimplement this function.

 QGb2312Codec ()
 Constructs a QGb2312Codec object.
 
- Public Functions inherited from QGb18030Codec
QList< QByteArray > aliases () const
 Subclasses can return a number of aliases for the codec in question.
 
 QGb18030Codec ()
 
- Public Functions inherited from QTextCodec
bool canEncode (QChar) const
 Returns true if the Unicode character ch can be fully encoded with this codec; otherwise returns false.

bool canEncode (const QString &) const
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. s contains the string being tested for encode-ability.

QByteArray fromUnicode (const QString &uc) const
 Converts uc from Unicode to the encoding of this codec, and returns the result in a QByteArray.

QByteArray fromUnicode (const QChar *in, int length, ConverterState *state=0) const
 Converts the first length characters of the input array in from Unicode to the encoding of this codec, and returns the result in a QByteArray.

QTextDecoder * makeDecoder () const
 Creates a QTextDecoder which stores enough state to decode chunks of char * data to create chunks of Unicode data.

QTextDecoder * makeDecoder (ConversionFlags flags) const

QTextEncoder * makeEncoder () const
 Creates a QTextEncoder which stores enough state to encode chunks of Unicode data as char * data.

QTextEncoder * makeEncoder (ConversionFlags flags) const

QString toUnicode (const QByteArray &) const
 Converts the given byte array from the encoding of this codec to Unicode, and returns the result in a QString.

QString toUnicode (const char *chars) const
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. chars contains the source characters.

QString toUnicode (const char *in, int length, ConverterState *state=0) const
 Converts the first length characters of the input in from the encoding of this codec to Unicode, and returns the result in a QString.
 

Static Public Functions

static QList< QByteArray > _aliases ()
 
static int _mibEnum ()
 
static QByteArray _name ()
 
- Static Public Functions inherited from QGb18030Codec
static QList< QByteArray > _aliases ()
 
static int _mibEnum ()
 
static QByteArray _name ()
 
- Static Public Functions inherited from QTextCodec
static QList< QByteArray > availableCodecs ()
 Returns the list of all available codecs, by name.

static QList< int > availableMibs ()
 Returns the list of MIBs for all available codecs.

static QTextCodec * codecForCStrings ()
 Returns the codec used by QString to convert to and from const char * and QByteArrays.

static QTextCodec * codecForHtml (const QByteArray &ba)
 Tries to detect the encoding of the provided snippet of HTML in the given byte array, ba, by checking the BOM (Byte Order Mark) and the content-type meta header, and returns a QTextCodec instance that is capable of decoding the HTML to Unicode.

static QTextCodec * codecForHtml (const QByteArray &ba, QTextCodec *defaultCodec)
 Tries to detect the encoding of the provided snippet of HTML in the given byte array, ba, by checking the BOM (Byte Order Mark) and the content-type meta header, and returns a QTextCodec instance that is capable of decoding the HTML to Unicode.

static QTextCodec * codecForLocale ()
 Returns a pointer to the codec most suitable for this locale.

static QTextCodec * codecForMib (int mib)
 Returns the QTextCodec which matches the MIBenum mib.

static QTextCodec * codecForName (const QByteArray &name)
 Searches all installed QTextCodec objects and returns the one which best matches name; the match is case-insensitive.

static QTextCodec * codecForName (const char *name)
 Searches all installed QTextCodec objects and returns the one which best matches name; the match is case-insensitive.

static QTextCodec * codecForTr ()
 Returns the codec used by QObject::tr() on its argument.

static QTextCodec * codecForUtfText (const QByteArray &ba)
 Tries to detect the encoding of the provided snippet ba by using the BOM (Byte Order Mark) and returns a QTextCodec instance that is capable of decoding the text to Unicode.

static QTextCodec * codecForUtfText (const QByteArray &ba, QTextCodec *defaultCodec)
 Tries to detect the encoding of the provided snippet ba by using the BOM (Byte Order Mark) and returns a QTextCodec instance that is capable of decoding the text to Unicode.

static void setCodecForCStrings (QTextCodec *c)

static void setCodecForLocale (QTextCodec *c)
 Sets the codec to c; this will be returned by codecForLocale().
 
static void setCodecForTr (QTextCodec *c)
 

Additional Inherited Members

- Public Types inherited from QTextCodec
enum  ConversionFlag { DefaultConversion, ConvertInvalidToNull = 0x80000000, IgnoreHeader = 0x1, FreeFunction = 0x2 }
 
- Protected Functions inherited from QTextCodec
 QTextCodec ()
 Constructs a QTextCodec, and gives it the highest precedence.

virtual ~QTextCodec ()
 Destroys the QTextCodec.
 

Detailed Description

The QGb2312Codec class provides conversion to and from the Chinese GB2312 encoding.

Warning
This class is not part of the public interface.

The GB2312 encoding has been superseded by the GB18030 encoding, which is backward compatible with GB2312. For this reason the QGb2312Codec class is implemented in terms of the GB18030 codec and uses its 0xA1A1-0xFEFE subset for conversion to and from Unicode.

The QGb2312Codec is kept mainly for compatibility reasons with older software.
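
The 0xA1A1-0xFEFE subset mentioned above can be illustrated with a small range check. This is a minimal standalone sketch in plain C++, not Qt code: the helper names isGb2312Byte and isGb2312Pair are hypothetical, and it assumes that each byte of a double-byte GB2312 character lies in 0xA1..0xFE, which is what the quoted subset implies.

```cpp
#include <cstdint>

// Assumption: every byte of a double-byte GB2312 character is in 0xA1..0xFE,
// as implied by the 0xA1A1-0xFEFE subset named in the description.
inline bool isGb2312Byte(std::uint8_t b)
{
    return b >= 0xA1 && b <= 0xFE;
}

// Hypothetical helper: true if the (lead, trail) pair lies inside the subset.
// It only checks the byte ranges; it does not say whether a character is
// actually assigned at that position.
inline bool isGb2312Pair(std::uint8_t lead, std::uint8_t trail)
{
    return isGb2312Byte(lead) && isGb2312Byte(trail);
}
```

For example, the pair 0xD6 0xD0 passes the check, while 0x81 0x40 (a pair outside the subset) does not.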

Definition at line 86 of file qgb18030codec.h.

Constructors and Destructors

◆ QGb2312Codec()

QGb2312Codec::QGb2312Codec ( )

Constructs a QGb2312Codec object.

Definition at line 457 of file qgb18030codec.cpp.

    : QGb18030Codec()
{
}

Functions

◆ _aliases()

static QList<QByteArray> QGb2312Codec::_aliases ( )
inline static

Definition at line 91 of file qgb18030codec.h.

Referenced by CNTextCodecs::aliases(), and CNTextCodecs::createForName().

◆ _mibEnum()

int QGb2312Codec::_mibEnum ( )
static

Definition at line 462 of file qgb18030codec.cpp.

Referenced by CNTextCodecs::createForMib(), and CNTextCodecs::mibEnums().

{
    return 2025;
}

◆ _name()

QByteArray QGb2312Codec::_name ( )
static

Definition at line 467 of file qgb18030codec.cpp.

Referenced by CNTextCodecs::createForName(), and CNTextCodecs::names().

{
    return "GB2312";
}

◆ convertFromUnicode()

QByteArray QGb2312Codec::convertFromUnicode ( const QChar *  uc,
int  len,
ConverterState *  state 
) const
virtual

Reimplemented Function

Reimplemented from QGb18030Codec.

Definition at line 548 of file qgb18030codec.cpp.

{
    char replacement = '?';
    if (state) {
        if (state->flags & ConvertInvalidToNull)
            replacement = 0;
    }
    int invalid = 0;

    int rlen = 2*len + 1;
    QByteArray rstr;
    rstr.resize(rlen);
    uchar* cursor = (uchar*)rstr.data();

    //qDebug("QGb2312Codec::fromUnicode(const QString& uc, int& lenInOut = %d) const", lenInOut);
    for (int i = 0; i < len; i++) {
        QChar ch = uc[i];
        uchar buf[2];

        if (ch.row() == 0x00 && ch.cell() < 0x80) {
            // ASCII
            *cursor++ = ch.cell();
        } else if ((qt_UnicodeToGbk(ch.unicode(), buf) == 2) &&
                   (buf[0] >= 0xA1) && (buf[1] >= 0xA1)) {
            *cursor++ = buf[0];
            *cursor++ = buf[1];
        } else {
            // Error
            *cursor++ = replacement;
            ++invalid;
        }
    }
    rstr.resize(cursor - (uchar*)rstr.constData());

    if (state) {
        state->invalidChars += invalid;
    }
    return rstr;
}
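
The encoding loop above can be mirrored in a minimal standalone sketch that does not depend on Qt. Everything named here is an assumption made for illustration: lookupGb is a hypothetical two-entry stand-in for qt_UnicodeToGbk, and encodeGb2312 is a hypothetical function that follows the same three branches (ASCII pass-through, two-byte output when both bytes are at least 0xA1, replacement character otherwise).

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the codec's Unicode-to-GBK lookup. It knows only
// two characters so that the sketch stays self-contained.
inline bool lookupGb(char16_t u, std::uint8_t out[2])
{
    static const std::map<char16_t, std::uint16_t> table = {
        {0x4E2D, 0xD6D0},  // U+4E2D
        {0x6587, 0xCEC4},  // U+6587
    };
    const auto it = table.find(u);
    if (it == table.end())
        return false;
    out[0] = static_cast<std::uint8_t>(it->second >> 8);
    out[1] = static_cast<std::uint8_t>(it->second & 0xFF);
    return true;
}

// Encodes UTF-16 code units. Input that cannot be mapped is written as
// `replacement` and counted in `invalid`.
inline std::string encodeGb2312(const std::u16string &in, int &invalid,
                                char replacement = '?')
{
    std::string out;
    out.reserve(2 * in.size());  // at most two bytes per input unit
    invalid = 0;
    for (char16_t ch : in) {
        std::uint8_t buf[2];
        if (ch < 0x80) {
            out.push_back(static_cast<char>(ch));  // ASCII passes through
        } else if (lookupGb(ch, buf) && buf[0] >= 0xA1 && buf[1] >= 0xA1) {
            out.push_back(static_cast<char>(buf[0]));
            out.push_back(static_cast<char>(buf[1]));
        } else {
            out.push_back(replacement);  // unmappable character
            ++invalid;
        }
    }
    return out;
}
```

The sketch reserves two bytes per input unit for the same reason the listing sizes its buffer as 2*len + 1: no input character produces more than two output bytes.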

◆ convertToUnicode()

QString QGb2312Codec::convertToUnicode ( const char *  chars,
int  len,
ConverterState *  state 
) const
virtual

QTextCodec subclasses must reimplement this function.

Converts the first len characters of chars from the encoding of the subclass to Unicode, and returns the result in a QString.

state can be 0, in which case the conversion is stateless and default conversion rules should be used. If state is not 0, the codec should save the state after the conversion in state, and adjust the remainingChars and invalidChars members of the struct.

Reimplemented from QGb18030Codec.

Definition at line 473 of file qgb18030codec.cpp.

{
    uchar buf[2];
    int nbuf = 0;
    ushort replacement = QChar::ReplacementCharacter;
    if (state) {
        if (state->flags & ConvertInvalidToNull)
            replacement = QChar::Null;
        nbuf = state->remainingChars;
        buf[0] = state->state_data[0];
        buf[1] = state->state_data[1];
    }
    int invalid = 0;

    QString result;
    result.resize(len);
    int unicodeLen = 0;
    ushort *const resultData = reinterpret_cast<ushort*>(result.data());
    //qDebug("QGb2312Decoder::toUnicode(const char* chars, int len = %d)", len);
    for (int i=0; i<len; i++) {
        uchar ch = chars[i];
        switch (nbuf) {
        case 0:
            if (IsLatin(ch)) {
                // ASCII
                resultData[unicodeLen] = ch;
                ++unicodeLen;
            } else if (IsByteInGb2312(ch)) {
                // GB2312 1st byte?
                buf[0] = ch;
                nbuf = 1;
            } else {
                // Invalid
                resultData[unicodeLen] = replacement;
                ++unicodeLen;
                ++invalid;
            }
            break;
        case 1:
            // GB2312 2nd byte
            if (IsByteInGb2312(ch)) {
                buf[1] = ch;
                int clen = 2;
                uint u = qt_Gb18030ToUnicode(buf, clen);
                if (clen == 2) {
                    resultData[unicodeLen] = qValidChar(static_cast<ushort>(u));
                    ++unicodeLen;
                } else {
                    resultData[unicodeLen] = replacement;
                    ++unicodeLen;
                    ++invalid;
                }
                nbuf = 0;
            } else {
                // Error
                resultData[unicodeLen] = replacement;
                ++unicodeLen;
                ++invalid;
                nbuf = 0;
            }
            break;
        }
    }
    result.resize(unicodeLen);

    if (state) {
        state->remainingChars = nbuf;
        state->state_data[0] = buf[0];
        state->state_data[1] = buf[1];
        state->invalidChars += invalid;
    }
    return result;
}
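
The way the listing carries a pending lead byte between calls can be shown with a minimal standalone sketch that does not depend on Qt. All names are assumptions made for illustration: DecodeState is a hypothetical struct loosely modelled on the remainingChars, state_data and invalidChars fields, lookupUnicode is a two-entry stand-in for qt_Gb18030ToUnicode, and inGb2312 assumes the byte range 0xA1..0xFE.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical carry-over state kept between calls.
struct DecodeState {
    int pending = 0;        // 1 while a lead byte waits for its trail byte
    std::uint8_t lead = 0;  // the saved lead byte
    int invalid = 0;        // running count of replaced sequences
};

// Assumption: GB2312 lead and trail bytes are in 0xA1..0xFE.
inline bool inGb2312(std::uint8_t b) { return b >= 0xA1 && b <= 0xFE; }

// Hypothetical two-entry stand-in for the real lookup table.
inline char16_t lookupUnicode(std::uint8_t lead, std::uint8_t trail)
{
    if (lead == 0xD6 && trail == 0xD0) return 0x4E2D;
    if (lead == 0xCE && trail == 0xC4) return 0x6587;
    return 0;  // unmapped in this sketch
}

// Decodes one chunk; a lead byte at the end of the chunk is kept in `st`
// and completed by the first byte of the next chunk.
inline std::u16string decodeChunk(const char *chars, std::size_t len,
                                  DecodeState &st)
{
    const char16_t replacement = 0xFFFD;
    std::u16string out;
    for (std::size_t i = 0; i < len; ++i) {
        const std::uint8_t ch = static_cast<std::uint8_t>(chars[i]);
        if (st.pending == 0) {
            if (ch < 0x80) {
                out.push_back(ch);  // ASCII
            } else if (inGb2312(ch)) {
                st.lead = ch;       // remember the lead byte
                st.pending = 1;
            } else {
                out.push_back(replacement);
                ++st.invalid;
            }
        } else {
            const char16_t u = inGb2312(ch) ? lookupUnicode(st.lead, ch) : 0;
            if (u != 0) {
                out.push_back(u);
            } else {
                out.push_back(replacement);
                ++st.invalid;
            }
            st.pending = 0;
        }
    }
    return out;
}
```

Splitting the bytes 0xD6 0xD0 across two calls yields no output for the first call and one character for the second, which is the behaviour the state parameter exists to support.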

◆ mibEnum()

int QGb2312Codec::mibEnum ( ) const
inline virtual

Subclasses of QTextCodec must reimplement this function.

It returns the MIBenum (see IANA character-sets encoding file for more information). It is important that each QTextCodec subclass returns the correct unique value for this function.

Reimplemented from QGb18030Codec.

Definition at line 95 of file qgb18030codec.h.

{ return _mibEnum(); }

◆ name()

QByteArray QGb2312Codec::name ( ) const
inline virtual

QTextCodec subclasses must reimplement this function.

It returns the name of the encoding supported by the subclass.

If the codec is registered as a character set in the IANA character-sets encoding file this method should return the preferred mime name for the codec if defined, otherwise its name.

Reimplemented from QGb18030Codec.

Definition at line 94 of file qgb18030codec.h.

{ return _name(); }

The documentation for this class was generated from the following files: