info guile

6.6.6.6 Interpreting Bytevector Contents as Unicode Strings

Bytevector contents can also be interpreted as Unicode strings encoded in one of the most commonly available encoding formats. See the section "Representing Strings as Bytes" for a more generic interface.

(utf8->string (u8-list->bytevector '(99 97 102 101)))
⇒ "cafe"

(string->utf8 "café") ;; U+00E9 LATIN SMALL LETTER E WITH ACUTE
⇒ #vu8(99 97 102 195 169)

Scheme Procedure: string->utf8 str
Scheme Procedure: string->utf16 str [endianness]
Scheme Procedure: string->utf32 str [endianness]
C Function: scm_string_to_utf8 (str)
C Function: scm_string_to_utf16 (str, endianness)
C Function: scm_string_to_utf32 (str, endianness): Return a newly allocated bytevector that contains the UTF-8, UTF-16, or UTF-32 (also known as UCS-4) encoding of str. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big endian.
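
As an illustration of the endianness argument, the following minimal sketch encodes a one-character string in each format. It assumes Guile 2.0 or later with the (rnrs bytevectors) module loaded; the variable names are hypothetical and chosen only for this example. The byte values in the comments follow from the code point U+00E9.

```scheme
;; Sketch only: assumes Guile 2.0 or later with (rnrs bytevectors).
(use-modules (rnrs bytevectors))

;; A one-character string holding U+00E9 (e with acute accent), built from
;; its code point so the example does not depend on the source file encoding.
(define e-acute (string (integer->char #xE9)))

;; UTF-8 takes no endianness argument.
(define as-utf8 (string->utf8 e-acute))               ; #vu8(195 169)

;; Endianness omitted: big endian is used.
(define as-utf16-be (string->utf16 e-acute))          ; #vu8(0 233)

;; Endianness given explicitly as a symbol.
(define as-utf16-le (string->utf16 e-acute 'little))  ; #vu8(233 0)
(define as-utf32-be (string->utf32 e-acute 'big))     ; #vu8(0 0 0 233)
(define as-utf32-le (string->utf32 e-acute 'little))  ; #vu8(233 0 0 0)
```

The same character occupies two bytes in UTF-8 and UTF-16 but four in UTF-32, and only the byte order differs between the big and little variants.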

Scheme Procedure: utf8->string utf
Scheme Procedure: utf16->string utf [endianness]
Scheme Procedure: utf32->string utf [endianness]
C Function: scm_utf8_to_string (utf)
C Function: scm_utf16_to_string (utf, endianness)
C Function: scm_utf32_to_string (utf, endianness): Return a newly allocated string that contains the UTF-8-, UTF-16-, or UTF-32-decoded contents of bytevector utf. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big endian.
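
The decoding direction can be sketched in the same way. The example below assumes Guile 2.0 or later with the (rnrs bytevectors) module loaded, and its variable names are hypothetical. It decodes the same text from big-endian and little-endian UTF-16 bytes, then round-trips it through UTF-32. The endianness passed to the decoder has to match the byte order of the data.

```scheme
;; Sketch only: assumes Guile 2.0 or later with (rnrs bytevectors).
(use-modules (rnrs bytevectors))

;; "caf" followed by U+00E9, as UTF-16 code units in both byte orders.
(define utf16-be (u8-list->bytevector '(0 99 0 97 0 102 0 233)))
(define utf16-le (u8-list->bytevector '(99 0 97 0 102 0 233 0)))

;; Endianness omitted: the bytes are read as big endian.
(define from-be (utf16->string utf16-be))

;; Little-endian data needs the matching symbol.
(define from-le (utf16->string utf16-le 'little))

;; Both byte sequences decode to the same four-character string.
(define same? (string=? from-be from-le))

;; Round trip through UTF-32, using the same endianness on both sides.
(define round-trip
  (utf32->string (string->utf32 from-be 'little) 'little))
```

Decoding little-endian bytes without the little argument would not generally recover the original text, since each pair of bytes would be read in the wrong order.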

This document was generated on April 20, 2013 using texi2html 5.0.