A basic iconv example in C
January 13th, 2011
This example converts ISO-8859-1 to UTF-8.
#include <stdlib.h>
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
int main (int argc, char *argv[]) {
    // create transcoder from iso-8859-1 to utf8
    iconv_t foo = iconv_open("UTF-8", "ISO-8859-1");
    // iconv_open returns (iconv_t)-1 on failure
    if (foo == (iconv_t)-1) {
        if (errno == EINVAL) {
            fprintf(stderr, "Conversion is not supported\n");
        } else {
            perror("Initialization failure");
        }
        // without a valid descriptor there is nothing left to do
        return 1;
    }
    // calloc fills memory with 0 bytes. we allocate two:
    // one for the 'ö' and one for the terminating 0 byte
    char *iso = calloc(2, sizeof(char));
    iso[0] = (char) 0xF6; // iso-8859-1 'ö'
    // the converted string can be up to four times larger
    // than the original, as a UTF-8 character is at most 4 bytes wide.
    char *converted = calloc(5, sizeof(char));
    // iconv advances both 'iso' and 'converted', so we keep pointers
    // to the start of each buffer for printing and for free()
    char *iso_start = iso;
    char *converted_start = converted;
    size_t ibl = 2; // input bytes left: 'ö' plus the 0 byte
    size_t obl = 5; // output bytes left in converted
    size_t obl_start = obl;
    // do it!
    size_t ret = iconv(foo, &iso, &ibl, &converted, &obl);
    // if iconv fails it returns (size_t)-1
    if (ret == (size_t)-1) {
        perror("iconv");
        iconv_close(foo);
        free(iso_start);
        free(converted_start);
        return 1;
    } else {
        // otherwise it returns the number of characters converted
        // irreversibly; the bytes written are obl_start - obl
        printf("%zu bytes written\n", obl_start - obl);
        printf("result: '%s'\n", converted_start);
        iconv_close(foo);
        free(iso_start);
        free(converted_start);
        return 0;
    }
}
March 3rd, 2014 at 13:07
Hey,
thanks for this example, it helped me use iconv without putting much time into it. But one little correction (I know this post is 3 years old, but Google doesn’t care about that :)):
iconv does NOT return the number of converted bytes. Instead, it returns the number of CHARACTERS converted IRREVERSIBLY.
That means, if you are converting e.g. from UTF-16 to UTF-8 which is always reversible, you will always get a return value of 0.
What you can do to get the number of bytes converted would be the following:
…
size_t obl = 5;
size_t obl_start = obl;
…
…conversion…
…
size_t converted_bytes = obl_start - obl;
Greets
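The byte-count suggestion above can be sketched as a small function. This is only an illustration under stated assumptions: the helper name `latin1_to_utf8` and its parameters are hypothetical and not part of the original post, and the conversion pair is the one used in the post. It reports both values separately, the bytes written (capacity minus what iconv left over) and iconv's own return value (the count of irreversible conversions).

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not from the original post.
 * Converts in_len bytes of ISO-8859-1 from `in` into `out`
 * (capacity out_cap). On success returns 0, stores the number of
 * bytes written in *written and iconv's return value (characters
 * converted irreversibly) in *irreversible. Returns -1 on failure. */
int latin1_to_utf8(const char *in, size_t in_len,
                   char *out, size_t out_cap,
                   size_t *written, size_t *irreversible)
{
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv wants char **, but it only reads the input bytes */
    char *in_ptr = (char *)in;
    char *out_ptr = out;
    size_t in_left = in_len;
    size_t out_left = out_cap;

    size_t ret = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (ret == (size_t)-1)
        return -1;

    *irreversible = ret;
    /* out_left was decremented by every byte iconv wrote */
    *written = out_cap - out_left;
    return 0;
}
```

Called with the single byte 0xF6 ('ö') and an 8 byte output buffer, this should report two bytes written and zero irreversible conversions, since every ISO-8859-1 character has an exact UTF-8 counterpart.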
March 4th, 2014 at 02:17
Many thanks for pointing this out! I’ll update the post as soon as possible.
April 20th, 2015 at 18:52
Need to change the following:
#include <stdio.h>
for fprintf() resolution
and
if((int) foo == -1) {
should be
if( foo == (iconv_t)-1) {
Otherwise, thanks for the example :>
April 20th, 2015 at 20:41
Hi Michael,
thanks for your feedback! I’ve added that.
Have a nice day,
Thorsten
June 18th, 2015 at 18:49
Hi Thorsten,
One question: if I want to convert a char string to a UTF-8 string, say a char string of length 11, should I allocate the sizes as below?
size_t ibl = 12; // len of char string
size_t obl = 45; // allocation for converted string
Please let me know if my understanding is correct.
June 19th, 2015 at 09:13
Hi,
I assume that the string is zero terminated. In this case 45 bytes is correct:
11 input chars (12 minus the 0 byte at the end)
times 4, since the max char width in UTF-8 is 4 bytes = 44 bytes
+1 for the zero byte at the end = 45.
Looks good!
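The arithmetic in this reply can be written down as a tiny helper. It is a sketch only: the name `utf8_worst_case_size` is made up for illustration, and the overflow guard is an addition that the thread does not discuss. It uses the same upper bound as the reply, 4 bytes per character plus one for the zero byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the original post.
 * Worst-case buffer size for converting `chars` input characters
 * to UTF-8: 4 bytes per character plus 1 for the terminating 0 byte.
 * Returns 0 if the result would not fit in a size_t. */
size_t utf8_worst_case_size(size_t chars)
{
    if (chars > (SIZE_MAX - 1) / 4)
        return 0;
    return chars * 4 + 1;
}
```

For the 11 character string from the question this gives 11 * 4 + 1 = 45, matching the value above. For ISO-8859-1 input specifically, 2 bytes per character would already be enough, because those characters never need more than 2 bytes in UTF-8; the factor 4 is simply the safe general bound.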
July 7th, 2015 at 08:57
Hi Thorsten,
One query here
After the iconv_close(foo); statement
I want to do
free(iso); // This is causing a core dump
free(converted); // This is working fine
iso = NULL;
converted = NULL;
converted_start = NULL;
What I noticed is that free(converted) works fine, but free(iso) fails. So is this being done by iconv implicitly?
Thanks.
July 7th, 2015 at 09:08
One correction here
free(converted_start) // This works fine
For the iso and converted pointers it fails.
So are both being freed implicitly? Or am I misunderstanding something?
Thanks.
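As the comment in the post's code says, iconv modifies the pointers it is given: it advances them past the bytes it has consumed or written. It does not free anything. After the call, `iso` and `converted` therefore no longer hold the addresses that calloc returned, and passing them to free() is invalid. A minimal sketch of one way to handle this, keeping the start addresses and giving iconv separate cursors, could look like the following. The helper name `latin1_to_utf8_dup` is hypothetical and not part of the original post.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the original post.
 * Returns a newly allocated UTF-8 copy of a zero terminated
 * ISO-8859-1 string, or NULL on failure. The caller frees it. */
char *latin1_to_utf8_dup(const char *latin1)
{
    size_t in_len = strlen(latin1);
    /* keep the addresses returned by the allocator ... */
    char *in_start = malloc(in_len + 1);
    char *out_start = calloc(in_len * 4 + 1, 1);
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (in_start == NULL || out_start == NULL || cd == (iconv_t)-1) {
        if (cd != (iconv_t)-1)
            iconv_close(cd);
        free(in_start);
        free(out_start);
        return NULL;
    }
    memcpy(in_start, latin1, in_len + 1);

    /* ... and hand iconv separate cursors that it may advance */
    char *in_cur = in_start;
    char *out_cur = out_start;
    size_t in_left = in_len;
    size_t out_left = in_len * 4;

    size_t ret = iconv(cd, &in_cur, &in_left, &out_cur, &out_left);
    iconv_close(cd);

    /* in_cur has moved; only in_start is valid for free() */
    free(in_start);
    if (ret == (size_t)-1) {
        free(out_start);
        return NULL;
    }
    /* calloc left a terminating 0 byte after the output */
    return out_start;
}
```

The input is copied to the heap here only to mirror the calloc'ed `iso` buffer from the post. The point is that free() is always called on `in_start` and `out_start`, never on the cursors.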