g_cs is a Chinese word (“你好”, which means “hello”) encoded in UTF-8. The code works under both Windows (WinXP + VS2005) and Linux (Ubuntu 12.04 + gcc 4.6). You can open a.txt to check whether the string was written correctly.
NOTE: Under Linux, we print the string directly, since the default console encoding is UTF-8 and the string displays correctly. Under Windows, the console does not properly support UTF-8 (code page 65001), so printing to it simply produces garbled text. Instead, we convert the string to a std::wstring and use the MessageBox() API to check the result. I may cover the Windows console encoding issue in my next post.
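To make the file check concrete, here is a minimal sketch of writing such a string and reading it back. It is not the original test code: the helper names write_utf8 and read_bytes are my own, and the string is spelled out as raw bytes so the result does not depend on the source file's encoding.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// "你好" as raw UTF-8 bytes: U+4F60 -> E4 BD A0, U+597D -> E5 A5 BD.
static const std::string g_cs = "\xE4\xBD\xA0\xE5\xA5\xBD";

// Hypothetical helper: writes the bytes unchanged to `path`.
// Binary mode avoids newline translation on Windows.
bool write_utf8(const std::string& path, const std::string& text) {
    std::ofstream out(path.c_str(), std::ios::binary);
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    return out.good();
}

// Hypothetical helper: reads the whole file back as raw bytes.
std::string read_bytes(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling write_utf8("a.txt", g_cs) and then comparing read_bytes("a.txt") with g_cs should confirm that the six bytes survive the round trip.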
I began investigating this problem because I could not find a way to read or write a UTF-8 string in an XML file using boost::property_tree. It turns out to be a bug that is already fixed in Boost 1.47 and later. Unfortunately, Ubuntu 12.04 ships with Boost 1.46.1. When reading non-ASCII characters, some bytes are incorrectly skipped. The failing function is boost::property_tree::detail::rapidxml::internal::get_index(). My test code looks like:
Generally, a logger is a singleton class. Its declaration may look like:
C++
#ifndef _LOGGER_H
#define _LOGGER_H

#include <string>

class Logger
{
private:
    Logger() {}
public:
    static void Init(const std::string &name);
    static Logger *GetInstance();
    void Write(const char *format, ...);
private:
    static std::string ms_name;
    static Logger *ms_this_logger;
};

#endif
The Init function sets the log name and possibly other configuration information. The Write function writes log entries.
In a multithreaded environment, locks must be added to prevent concurrency issues and keep the log output in order. Sometimes we also want separate log configurations. How can we implement this without breaking the original interface?
One easy way is to maintain a list of all available Logger instances, so that each thread can find and use its own Logger. This approach is somewhat like the one used in log4j. But log4j reads configuration files to initialize its loggers, while our configuration is set at runtime.
Another big issue is that we would have to add a new parameter to the GetInstance function to tell the class which Logger to return. That change breaks the interface.
By using TLS (thread-local storage), we can easily solve both issues. Every logger becomes thread-local; that is, every thread has its own logger instance stored in its thread context. Here is the declaration of our new Logger class; boost::thread_specific_ptr from the Boost library is used to simplify the TLS operations:
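As a rough sketch of the same idea without the Boost dependency, the version below swaps boost::thread_specific_ptr for the C++11 thread_local keyword. That swap is my assumption, not the approach described above, and the Name() accessor and the loggers_are_per_thread() check are hypothetical additions for illustration. Note that Init and GetInstance keep their original signatures.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>
#include <thread>

class Logger {
private:
    Logger() {}
public:
    // Sets the log name for the calling thread only.
    static void Init(const std::string& name) { GetInstance()->m_name = name; }
    // Same signature as before; returns the calling thread's own instance.
    static Logger* GetInstance() {
        static thread_local Logger instance;  // one per thread, destroyed at thread exit
        return &instance;
    }
    // Hypothetical accessor, added only so the example can be checked.
    const std::string& Name() const { return m_name; }
    void Write(const char* format, ...) {
        va_list args;
        va_start(args, format);
        std::fprintf(stderr, "[%s] ", m_name.c_str());
        std::vfprintf(stderr, format, args);
        std::fputc('\n', stderr);
        va_end(args);
    }
private:
    std::string m_name;  // no longer static: each thread's instance has its own name
};

// Hypothetical check: Init in a worker thread must not affect the caller's logger.
bool loggers_are_per_thread() {
    Logger::Init("main");
    std::string worker_name;
    std::thread t([&worker_name] {
        Logger::Init("worker");
        Logger::GetInstance()->Write("hello from %s", "worker");
        worker_name = Logger::GetInstance()->Name();
    });
    t.join();
    return worker_name == "worker" && Logger::GetInstance()->Name() == "main";
}
```

Because each thread only ever touches its own instance, no lock is needed around the logger's state in this sketch. Threads that write to the same file or stream would still have to coordinate that output.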
A smart pointer is an abstract data type that simulates a pointer while providing additional features, such as automatic memory management or bounds checking. The C++03 library provides auto_ptr for general use, but it is not easy to use correctly, and you may run into pitfalls and limitations. Its main drawback is its transfer-of-ownership semantics: copying or assigning an auto_ptr hands ownership to the destination and leaves the source NULL. Let me walk through it. Please read the comments in the code carefully:
C++
int *test_auto_ptr_exp() {
    auto_ptr<int> p(new int(1));
    throw runtime_error("auto_ptr test exception.");
    /* exception-safe, p is freed even when an exception is thrown. */
    return p.get();
}

void test_auto_ptr_basic() {
    auto_ptr<int> p1(new int(1));
    auto_ptr<int> p2(new int(2));
    auto_ptr<int> p3(p1);
    auto_ptr<int> p4;
    p4 = p2;
    if (p1.get()) { /* NULL */
        cout << "*p1=" << *p1 << endl;
    }
    if (p2.get()) { /* NULL */
        cout << "*p2=" << *p2 << endl;
    }
    if (p3.get()) { /* ownership already transferred from p1 to p3 */
        cout << "*p3=" << *p3 << endl;
    }
    if (p4.get()) { /* ownership already transferred from p2 to p4 */
        cout << "*p4=" << *p4 << endl;
    }
    /* ERROR: void is a type of template specialization */
    //auto_ptr<void> ptr5(new int(3));
}

void test_auto_ptr_errors() {
    /* ERROR: statically allocated object */
    const char *str = "Hello";
    auto_ptr<const char> p1(str);
    /* ERROR: two auto_ptrs refer to the same object */
    int *pi = new int(5);
    auto_ptr<int> p2(pi);
    auto_ptr<int> p3(p2.get());
    p2.~auto_ptr(); /* now p3 is not available too */
    /* ERROR: hold a pointer to a dynamically allocated array */
    /* When destroyed, it only deletes the first single object. */
    auto_ptr<int>(new int[10]);
    /* ERROR: store an auto_ptr in a container */
    //vector<auto_ptr<int> > vec;
    //vec.push_back(auto_ptr<int>(new int(1)));
    //vec.push_back(auto_ptr<int>(new int(2)));
    //auto_ptr<int> p4(vec[0]); /* vec[0] is assigned NULL */
    //auto_ptr<int> p5;
    //p5 = vec[1]; /* vec[1] is assigned NULL */
}
3. unique_ptr
To resolve these drawbacks, C++0x deprecates auto_ptr and introduces unique_ptr as its replacement. unique_ptr makes use of a new C++ language feature called the rvalue reference, which is similar to the existing (lvalue) reference (&) but is spelled &&. GCC implemented this feature in 4.3, but unique_ptr is only available from 4.4 onward.
What is rvalue?
rvalues are temporaries that evaporate at the end of the full-expression in which they live (“at the semicolon”). For example, 1729, x + y, std::string("meow"), and x++ are all rvalues.
lvalues, in contrast, name objects that persist beyond a single expression. For example, obj, *ptr, ptr[index], and ++x are all lvalues.
NOTE: It’s important to remember: lvalueness versus rvalueness is a property of expressions, not of objects.
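One way to see the distinction is to let overload resolution report it. The sketch below is only an illustration; the names category, make_string and demo are hypothetical. One overload takes a const lvalue reference and the other an rvalue reference, and the compiler picks between them based on the expression passed in.

```cpp
#include <string>
#include <utility>

// Chosen for lvalues: named objects that persist beyond the expression.
const char* category(const std::string&) { return "lvalue"; }
// Chosen for rvalues: temporaries and the result of std::move.
const char* category(std::string&&) { return "rvalue"; }

std::string make_string() { return "meow"; }

// Collects the categories of four sample expressions, separated by spaces.
std::string demo() {
    std::string s = "meow";
    std::string out;
    out += category(s);             out += ' ';  // s names an object
    out += category(s + s);         out += ' ';  // temporary result of operator+
    out += category(make_string()); out += ' ';  // temporary returned by value
    out += category(std::move(s));               // same object, cast to an rvalue
    return out;
}
```

demo() returns "lvalue rvalue rvalue rvalue". The last case uses the same object s as the first one, which is exactly the point of the note above: the category belongs to the expression, not to the object.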
We may need another whole post to address rvalues. For now, let’s take a look at the basic usage. Please read the comments carefully:
C++
unique_ptr<int> get_unique_ptr(int i) {
    return unique_ptr<int>(new int(i));
}

void use_unique_ptr(unique_ptr<int> p) {
    /* p is deleted when this function finishes running. */
}

void test_unique_ptr_basic() {
    unique_ptr<int> p(new int(1));
    /*
     * One can construct from an rvalue unique_ptr (ownership is moved).
     * But one cannot make a copy of an lvalue unique_ptr.
     * Note the defaulted and deleted functions usage in source code (c++0x).
     */
    //unique_ptr<int> p2 = p; /* error */
    //use_unique_ptr(p); /* error */
    use_unique_ptr(move(p));
    use_unique_ptr(get_unique_ptr(3));
}
A unique_ptr can ONLY be constructed from an rvalue unique_ptr, and doing so moves ownership rather than copying it. This prevents the ownership issues seen with auto_ptr. Since a temporary cannot be referenced after the current expression, two unique_ptr objects cannot end up referring to the same pointer this way. You may also have noticed the move function. We will discuss it in a later post.
    /* allow void pointer, but a custom deleter must be used. */
    unique_ptr<void, aclass_deleter> p3(new aclass);
}
unique_ptr can also hold a pointer to an array. It uses a deleter to free its internal pointer: the predefined default_delete uses delete for single objects and delete[] for arrays, and you can define your own. In addition, void can be used as the element type, provided a custom deleter is supplied.
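Both cases can be sketched as follows. The definitions of aclass and aclass_deleter here are stand-ins of my own, since the originals are not shown; the live counter exists only so the example can verify that the object is really destroyed.

```cpp
#include <memory>

// Array form: std::default_delete<int[]> calls delete[] when `a` goes out of scope.
int sum_array() {
    std::unique_ptr<int[]> a(new int[3]);
    for (int i = 0; i < 3; ++i) a[i] = i + 1;  // operator[] exists only in the array form
    return a[0] + a[1] + a[2];
}

// Stand-in class that counts its live instances.
struct aclass {
    static int live;
    aclass() { ++live; }
    ~aclass() { --live; }
};
int aclass::live = 0;

// Stand-in deleter: unique_ptr<void, ...> only sees a void*, so the deleter
// has to restore the real type before calling delete.
struct aclass_deleter {
    void operator()(void* p) const { delete static_cast<aclass*>(p); }
};

bool void_pointer_is_released() {
    {
        std::unique_ptr<void, aclass_deleter> p(new aclass);
        if (aclass::live != 1) return false;
    }  // aclass_deleter runs here
    return aclass::live == 0;
}
```

Without the cast in the deleter there would be no way to run the right destructor, which is why unique_ptr<void> cannot rely on the default deleter.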
NOTE: To compile the code, you must specify the -std=c++0x flag.
4. shared_ptr
A shared_ptr is used to represent shared ownership; that is, when two pieces of code need access to some data but neither has exclusive ownership (in the sense of being responsible for destroying the object). A shared_ptr is a kind of counted pointer where the object pointed to is deleted when the use count goes to zero.
The following snippet shows how the use count changes when using shared_ptr. The count goes from 0 up to 3, then back down to 0:
C++
struct bclass {
    int i;
    bclass(int i) { this->i = i; }
    virtual ~bclass() { cout << "in bclass::dtor() with i=" << i << endl; }
};

struct cclass : bclass {
    cclass(int i) : bclass(i) {}
    virtual ~cclass() { cout << "in cclass::dtor() with i=" << i << endl; }
};

void use_shared_ptr(shared_ptr<int> p) {
    cout << "count=" << p.use_count() << endl; /* 3: p, p2 and the parameter */
}

void test_shared_ptr_basic() {
    shared_ptr<int> p;
    cout << "count=" << p.use_count() << endl; /* 0 */
    p.reset(new int(1));
    cout << "count=" << p.use_count() << endl; /* 1 */
    shared_ptr<int> p2 = p;
    cout << "count=" << p.use_count() << endl; /* 2 */
    use_shared_ptr(p2);
    cout << "count=" << p.use_count() << endl; /* 2 */
    /* use reset() to release; calling the destructor explicitly is undefined behavior */
    p2.reset();
    cout << "count=" << p.use_count() << endl; /* 1 */
    p.reset();
    cout << "count=" << p.use_count() << endl; /* 0 */
}
Snippet showing pointer type conversion:
C++
void test_shared_ptr_convertion() {
    /* p is deleted accurately without custom deleter */
The void type can be used directly without a custom deleter, unlike with unique_ptr, where one is required. shared_ptr saves the exact type information in its constructor; refer to the source code for details :). The static_pointer_cast function is used to convert between pointer types.
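A small sketch of both points, using the std:: names. The base and derived types and the destroyed counter are hypothetical; they exist only to make the destructor call observable.

```cpp
#include <memory>

struct base {
    static int destroyed;
    virtual ~base() { ++destroyed; }
};
int base::destroyed = 0;

struct derived : base {
    int value;
    explicit derived(int v) : value(v) {}
};

// The constructor sees a derived*, so the matching delete is remembered
// even though the static type of the pointer is shared_ptr<void>.
bool void_owner_runs_destructor() {
    int before = base::destroyed;
    { std::shared_ptr<void> pv(new derived(1)); }
    return base::destroyed == before + 1;
}

// static_pointer_cast converts back to the real type and shares ownership with pv.
int read_through_void() {
    std::shared_ptr<void> pv(new derived(7));
    std::shared_ptr<derived> pd = std::static_pointer_cast<derived>(pv);
    return pd->value;
}
```

read_through_void() returns 7. Like a plain static_cast, static_pointer_cast performs no runtime check, so it is up to the caller to know what the void pointer really refers to.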
Unlike auto_ptr, shared_ptr can be copied without losing ownership, so it can be used in STL containers:
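As a minimal illustration (the function name is hypothetical), this mirrors the container case that was commented out in the auto_ptr example:

```cpp
#include <memory>
#include <vector>

// Copying an element out of the vector leaves the element intact;
// with auto_ptr the element would have been reset to NULL.
long count_after_copy() {
    std::vector<std::shared_ptr<int> > vec;
    vec.push_back(std::shared_ptr<int>(new int(1)));
    vec.push_back(std::shared_ptr<int>(new int(2)));
    std::shared_ptr<int> p = vec[0];
    if (!vec[0] || *vec[0] != 1) return -1;  // the element must still be valid
    return p.use_count();                    // two owners: vec[0] and p
}
```

count_after_copy() returns 2, since both the vector element and the local copy own the integer.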
NOTE: shared_ptr is available in both TR1 and the Boost library. You can use either of them, since their interfaces are compatible. In addition, there are dual C++0x and TR1 implementations. The TR1 implementation is considered relatively stable, so it is unlikely to change unless bug fixes require it.
5. weak_ptr
weak_ptr objects are used to break cycles in data structures. See the snippet:
C++
struct mynode {
    int i;
    shared_ptr<mynode> snext;
    weak_ptr<mynode> wnext;
    mynode(int i) { this->i = i; }
    ~mynode() { cout << "in mynode::dtor() with i=" << i << endl; }
};

void test_weak_ptr() {
    shared_ptr<mynode> head(new mynode(1));
    head->snext = shared_ptr<mynode>(new mynode(2));
    /* use weak_ptr to solve cyclic dependency */
    //head->snext = head;
    head->wnext = head;
}
If we uncomment the line that uses shared_ptr, head is not freed, since there is still one reference to it when the function exits. With weak_ptr, the code works fine.
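The difference shows up directly in the use count. In this sketch (a stripped-down node type and hypothetical function names), the shared_ptr link adds an owner while the weak_ptr link does not. I break the shared cycle by hand at the end so that the example itself does not leak.

```cpp
#include <memory>

struct node {
    std::shared_ptr<node> snext;
    std::weak_ptr<node> wnext;
};

long owners_with_shared_link() {
    std::shared_ptr<node> head(new node);
    head->snext = head;         // cycle: the node now owns itself
    long n = head.use_count();  // head plus head->snext
    head->snext.reset();        // break the cycle by hand so nothing leaks here
    return n;
}

long owners_with_weak_link() {
    std::shared_ptr<node> head(new node);
    head->wnext = head;         // observes the node without owning it
    return head.use_count();
}

// Once the last owner is gone, the weak_ptr reports it and lock() yields an empty pointer.
bool weak_link_expires() {
    std::weak_ptr<node> w;
    {
        std::shared_ptr<node> head(new node);
        w = head;
    }
    return w.expired() && !w.lock();
}
```

owners_with_shared_link() returns 2 and owners_with_weak_link() returns 1. In the first case the count can never reach zero by itself, which is the leak described above.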
6. scoped_ptr
The scoped_ptr template is a simple solution for simple needs. It supplies a basic “resource acquisition is initialization” facility, without shared-ownership or transfer-of-ownership semantics.
This class is only available in Boost. Since unique_ptr already exists in C++0x, it may be considered redundant. The snippet is simple as well:
C++
void test_scoped_ptr() {
    /* simple solution for simple needs */
    scoped_ptr<aclass> p(new aclass);
}
The complete and updated code can be found on Google Code here. I use conditional compilation to switch between the TR1 and Boost implementations in the code. I hope you find it useful.
Let me clarify some concepts first. What is C++0x? Wikipedia gives an overview here:
C++0x is intended to replace the existing C++ standard, ISO/IEC 14882, which was published in 1998 and updated in 2003. These predecessors are informally but commonly known as C++98 and C++03. The new standard will include several additions to the core language and will extend the C++ standard library, incorporating most of the C++ Technical Report 1 (TR1) libraries — with the exception of the library of mathematical special functions.
The aim is for the ‘x’ in C++0x to become ‘9’: C++09, rather than (say) C++0xA (hexadecimal :-).
You may also have noticed TR1. Again, from Wikipedia:
C++ Technical Report 1 (TR1) is the common name for ISO/IEC TR 19768, C++ Library Extensions, which is a document proposing additions to the C++ standard library. The additions include regular expressions, smart pointers, hash tables, and random number generators. TR1 is not a standard itself, but rather a draft document. However, most of its proposals are likely to become part of the next official standard.
See the relationship? C++0x is the standard that adds features to both the language and the standard library. Its library consists of a large set of the TR1 libraries plus some additional ones. For instance, unique_ptr is not defined in TR1, but is included in C++0x.
As of 12 August 2011, the C++0x specification has been approved by the ISO.
Another notable concept is the Boost library. It can be regarded as a portable, easy-to-use extension to the current C++03 standard library. Some of its libraries, such as smart pointers and regular expressions, have already been included in TR1. You can find license headers regarding the donation of Boost code in the libstdc++ source files. TR2 is expected to incorporate more Boost code.
TR1 libraries can be accessed through the std::tr1 namespace. More info from Wikipedia here:
Various full and partial implementations of TR1 are currently available using the namespace std::tr1. For C++0x they will be moved to namespace std. However, as TR1 features are brought into the C++0x standard library, they are upgraded where appropriate with C++0x language features that were not available in the initial TR1 version. Also, they may be enhanced with features that were possible under C++03, but were not part of the original TR1 specification.
The committee intends to create a second technical report (called TR2) after the standardization of C++0x is complete. Library proposals which are not ready in time for C++0x will be put into TR2 or further technical reports.
This article is already getting a bit long, so I will give my snippets in a later post.