A basic iconv example in C

January 13th, 2011

This example converts ISO-8859-1 to UTF-8.

#include <stdlib.h>
#include <errno.h>
#include <iconv.h>
#include <stdio.h>


int main (int argc, char *argv[]) {

    // create transcoder from iso-8859-1 to utf8
    iconv_t foo = iconv_open("UTF-8", "ISO-8859-1");
    // iconv_open signals failure by returning (iconv_t)-1
    if (foo == (iconv_t)-1) {
        if (errno == EINVAL) {
            fprintf(stderr,
                "Conversion is not supported\n");
        } else {
            perror("Initialization failure");
        }
        // without a valid descriptor there is nothing to convert
        return 1;
    }
    // calloc fills memory with 0 bytes. we alloc two -
    // one for the 'ö' and one for the terminating 0 byte
    char *iso = calloc(2, sizeof(char));
    iso[0] = (char)0xF6; // iso-8859-1 'ö'

    // the converted string can be up to four times larger
    // than the original, as the largest char width in utf8 is 4 bytes.
    char *converted = calloc(5, sizeof(char));

    // we need to store an additional pointer that targets the
    // start of converted. (iconv modifies the original 'converted')
    char *converted_start = converted;

    size_t ibl = 2; // len of iso
    size_t obl = 5; // len of converted

    // do it!
    size_t ret = iconv(foo, &iso, &ibl, &converted, &obl);

    // if iconv fails it returns (size_t)-1
    if (ret == (size_t)-1) {
        perror("iconv");
        iconv_close(foo);
        return 1;
    } else {
        // otherwise it returns the number of characters converted
        // irreversibly. the number of bytes written is the initial
        // output size (5) minus what is left in obl
        printf("%zu bytes converted\n", (size_t)5 - obl);
        printf("result: '%s'\n", converted_start);
        iconv_close(foo);
        return 0;
    }
}

8 Responses to “A basic iconv example in C”

  1. F.Satzger Says:

    Hey,

    Thanks for this example, it helped me use iconv without putting much time into it. But one little correction (I know, this post is 3 years old, but Google doesn’t care about that :)):

    iconv does NOT return the number of converted bytes. Instead, it returns the number of CHARACTERS converted IRREVERSIBLY.

    That means, if you are converting e.g. from UTF-16 to UTF-8 which is always reversible, you will always get a return value of 0.

    What you can do to get the number of bytes converted would be the following:


    size_t obl = 5;
    size_t obl_start=obl;

    …conversion…

    size_t converted_bytes = obl_start - obl;

    Greets
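
The byte count described in this comment can be sketched as a small helper. This is only an illustration under assumptions: the function name `latin1_to_utf8` and its parameters are made up for this example and are not part of iconv, and it assumes the glibc-style `iconv` prototype that takes `char **`.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of iconv): converts inlen bytes of
 * ISO-8859-1 from `in` into `out`, which has room for outcap bytes.
 * Returns the number of bytes written, or (size_t)-1 on any error. */
size_t latin1_to_utf8(const char *in, size_t inlen, char *out, size_t outcap)
{
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv wants non-const pointers; it does not modify the input bytes */
    char *inp = (char *)in;
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    /* rc is the number of irreversible conversions, NOT a byte count */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    /* bytes written = initial output space minus what is left */
    return outcap - outleft;
}
```

Called with the single byte 0xF6 and an 8 byte output buffer, this should report 2 bytes written, since 'ö' is encoded as two bytes in UTF-8.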

  2. thorsten Says:

    Many thanks for pointing this out! I’ll update the post as soon as possible.

  3. Michael Leib Says:

    Need to change the following:

    #include <stdio.h>

    for fprintf() resolution

    and

    if((int) foo == -1) {

    should be

    if( foo == (iconv_t)-1) {

    Otherwise, thanks for the example :>

  4. thorsten Says:

    Hi Michael,

    thanks for your feedback! I’ve added that.

    Have a nice day,
    Thorsten

  5. Sand1988 Says:

    Hi Thorsten,

    One question: if I want to convert a char string to a UTF-8 string, say a char string of length 11, should I allocate the sizes as below?

    size_t ibl = 12; // len of char string
    size_t obl = 45; // allocation for converted string

    Please let me know if my understanding is correct.

  6. thorsten Says:

    Hi,

    I assume that the string is zero terminated. In this case 45 bytes is correct:

    11 input chars (12 minus the 0 byte at the end)

    times 4, since the max char width in UTF-8 is 4 bytes = 44 bytes

    +1 for the zero byte at the end = 45.

    Looks good! :)
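
The sizing rule from this exchange could be written as a helper along these lines. It is a sketch, not something from the original post: `utf8_worst_case_size` and `UTF8_MAX_BYTES_PER_CHAR` are names invented for this example, and it assumes a zero-terminated input in a single-byte encoding such as ISO-8859-1.

```c
#include <stdint.h>
#include <string.h>

/* UTF-8 encodes one code point in at most 4 bytes */
#define UTF8_MAX_BYTES_PER_CHAR 4

/* Hypothetical helper: worst-case size of the output buffer for a
 * zero-terminated string in a single-byte encoding. That is 4 bytes
 * per input char plus 1 for the terminating 0 byte.
 * Returns 0 if the computation would overflow size_t. */
size_t utf8_worst_case_size(const char *s)
{
    size_t len = strlen(s);
    if (len > (SIZE_MAX - 1) / UTF8_MAX_BYTES_PER_CHAR)
        return 0;
    return len * UTF8_MAX_BYTES_PER_CHAR + 1;
}
```

For an 11 character string this gives the 45 bytes from the comment above. For ISO-8859-1 input specifically, 2 bytes per char would already be enough, because every ISO-8859-1 character maps to a code point below U+0100, which UTF-8 encodes in at most 2 bytes; the factor 4 is simply the safe upper bound.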

  7. Sand1988 Says:

    Hi Thorsten,

    One question:
    After the iconv_close(foo); statement

    I want to do
    free(iso) // This is causing core dump
    free(converted) // This is working fine
    iso = NULL;
    converted = NULL;
    converted_start = NULL;
    What I noticed is that free(converted) works fine, but free(iso) fails. So is this being done by
    iconv implicitly?

    Thanks.

  8. Sand1988 Says:

    One correction here

    free(converted_start) // This works fine

    For the iso and converted pointers it fails.
    So are both being freed implicitly, or am I understanding this incorrectly?

    Thanks.
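
On the free() question: iconv does not free anything. It advances the pointers it is given, so after the call `iso` and `converted` point past the bytes that were consumed and produced, and passing such a pointer to free() is undefined behaviour. Only a pointer that still holds the address returned by calloc may be freed. The sketch below illustrates this; the function `convert_and_free` and the `iso_start` variable are invented for this example, and it assumes the glibc-style `iconv` prototype.

```c
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical demo: converts 'ö' plus its terminating 0 byte and
 * reports how far iconv moved the input and output pointers.
 * Returns 0 on success, -1 on failure. */
int convert_and_free(size_t *in_advance, size_t *out_advance)
{
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return -1;

    /* keep the addresses returned by calloc; these are what we free */
    char *iso_start = calloc(2, sizeof(char));
    char *converted_start = calloc(5, sizeof(char));
    if (iso_start == NULL || converted_start == NULL) {
        free(iso_start);
        free(converted_start);
        iconv_close(cd);
        return -1;
    }
    iso_start[0] = (char)0xF6; /* iso-8859-1 'ö' */

    /* working copies that iconv is allowed to move forward */
    char *iso = iso_start;
    char *converted = converted_start;
    size_t ibl = 2;
    size_t obl = 5;

    size_t rc = iconv(cd, &iso, &ibl, &converted, &obl);

    /* both working pointers no longer point at the start of their blocks */
    *in_advance = (size_t)(iso - iso_start);
    *out_advance = (size_t)(converted - converted_start);

    iconv_close(cd);
    free(iso_start);       /* not free(iso) */
    free(converted_start); /* not free(converted) */
    return rc == (size_t)-1 ? -1 : 0;
}
```

This matches the observation in the comment: free(converted_start) works because that pointer was never handed to iconv, while iso and converted have both been moved.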
