January 2006 - Posts
It was bound to happen. IBM has finally released a 'free' version of their DB2 product. I'm going to try to download it (~300MB) to see what it offers and compare it against the other 'free' databases. The only pain is that you have to register with IBM before you can download it. I wonder if they plan to spam me with ads afterwards ;)
http://news.zdnet.com/2100-3513_22-6032676.html
Oracle also released a free version of their database system, which I still need to have a look at.
Perhaps one day I'll need a database to keep track of all the free databases... :p
This is probably one of those more humorous titles into
which you can read more than you should. Actually I think there might be some
truth in it. Most surveys and polls tend to provide only a handful of static
choices and don’t allow people to give a full, true opinion anyway.
Read more about this here
It reminds me of another story that went something like this: "87% of all statistics are made up on the spot..." or something like that.
Even though I have worked with .Net serialization quite a few times I still discover new things now and then. Today I learned something about how to show/hide values in the generated serialized stream.
Say you have a class like this:
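using System;
using System.Xml.Serialization;

[Serializable]
[XmlType("MyClass")]
public class SomeClass
{
    // Written to the XML only when SomeDateSpecified is true.
    public DateTime SomeDate = DateTime.Now;

    // Controls whether SomeDate is emitted; never serialized itself.
    [XmlIgnore]
    public bool SomeDateSpecified;
}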
It is obvious that the second field will not be included in the serialized stream. However, the first field will not be included either! XmlSerializer treats a bool member whose name is another member's name plus the suffix "Specified" as a flag for that member: while the flag is false the member is left out. It is a built-in way of implementing 'null' values in serialization. Renaming the first field to anything else, or setting the "Specified" field to true, will have its value included in the output stream.
The second thing I discovered has to do with default values. If you mark a field with a System.ComponentModel.DefaultValueAttribute attribute, it will not be included in the serialized stream when its value equals the default value. Kinda makes sense, but sometimes you might actually want that 'default' value in the output.
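Both behaviours can be seen side by side in a small console program. This is only a sketch: the Retries field, its default of 5, the fixed date and the SerializationDemo helper are made up for illustration and are not part of the class above.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Xml.Serialization;

[XmlType("MyClass")]
public class SomeClass
{
    // Emitted only while SomeDateSpecified is true.
    public DateTime SomeDate = new DateTime(2006, 1, 25);

    // The "<Name>Specified" flag: read by XmlSerializer, never written out.
    [XmlIgnore]
    public bool SomeDateSpecified;

    // Hypothetical field: omitted whenever its value equals the declared default.
    [DefaultValue(5)]
    public int Retries = 5;
}

public static class SerializationDemo
{
    // Serializes the object to an XML string.
    public static string ToXml(SomeClass value)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SomeClass));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        SomeClass item = new SomeClass();
        // Flag is false and Retries equals its default: neither element is written.
        Console.WriteLine(ToXml(item));

        item.SomeDateSpecified = true;
        item.Retries = 3;
        // Flag is true and Retries differs from its default: both elements are written.
        Console.WriteLine(ToXml(item));
    }
}
```

If the 'default' value really has to show up in the output, the simplest route is probably to drop the DefaultValue attribute from that field.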
The following is just an idea I'm busy working on. It is intended only as a thought process and may change over time.
Artificial Intelligent Documents
An idea for future computing
Created by Rudolf Henning
Date: 25 January 2006
Introduction
This is a concept idea about how software and hardware can
evolve to a level where intelligent entities can be created to gather, store
and manipulate data. The name given to ‘entities’ like these in this document
is simply ‘AIDoc’ – ‘Artificial Intelligent Document Entities’.
AIDocs will be hardware and operating system neutral.
What is inside an AIDoc?
AIDocs will contain their own ‘executable’ code. It will be
in the form of uncompiled/semi-compiled source code. The o/s will have to at
least support functionality so the AIDoc can compile the portions of itself that are
executable on the platform. This means that there is no ‘program’ needed to ‘edit’
or work with/on the document contents. It will be self-maintaining,
communicating directly with the underlying o/s for functions like persistence,
allocation of hardware resources and a UI.
Interacting with the AIDoc
When an author (human) creates the AIDoc the first time it
will be loosely like a birth of an entity. It will be created from some sort of
template and once alive it can start interacting with the author. Things like
speech recognition would be part of its ‘DNA’ and thus the traditional method
of typing in words of text won’t be necessary anymore. This means that the data
and instructions it gathers from the author would be sound or even video and
the AIDoc will actually understand it. All of this information it gathers will
become part of itself and when it needs to be transferred or persisted this
information will go along.
Since there is no program needed to ‘run’ it, it can be
given a command to transfer itself to another piece of hardware and run or
start living there. Obviously, allowing the AIDoc to randomly transfer
itself would be equivalent to today’s worms and viruses. Thus by default it
will not be allowed to perform such actions. The process of transfer would be
something like this: First it must be given a command to be transferred and also
where to. The AIDoc will then negotiate with the o/s of the target hardware and
ask permission to be hosted on the platform. This will include making
sure the target hardware can support it etc. Once it is allowed to transfer it
will send its blueprint to the new platform to compile a working empty entity
that is compatible with that hardware. Once that is done it can transfer all its
‘knowledge’ or data and verify the progress. After that the old copy can
either self-terminate or persist/hibernate for future use on the old hardware.
If it ever needs to be transferred back this can help speed up the process
of restoring it.
From this concept of transfer it can be deduced that the AIDoc
would be a lot like a living creature. It needs to contain a blueprint of
itself. It only needs to contain higher-level source code for its functionality
since it is not hardware specific – much like the current interpreted
environments like VB, Java or .Net.
Too Intelligent?
To terminate an AIDoc you would have to give it an
instruction to ‘kill’ or terminate itself. This could be a problem if the
entity was truly self aware – and many science fiction stories can explain what
happens further. For that reason it should not be truly self aware.
When an AIDoc is no longer needed it can either be
terminated (kill/delete) or be hibernated for future use to persistent storage.
Even future hardware may have restrictions so that it cannot run too many AIDocs
at any one time.
Communicating with outside world
Another concept of this entity is that it will not just
store and organize information in itself; since it is aware of the content
it can, when given the instruction to, gather more information on the relevant
topic from the extended network, i.e. the likes of the Internet.
Say you were creating an essay or school task on ‘Plants in the dinosaur
times’: the AIDoc can be given a command to gather as much information as possible on
the topic and then verbally/visually inform the author and allow him/her to
decide what will be included in the final essay. The important bit here is that
it is not the author that needs to go and search on the Internet to find
relevant information.
Security – hopefully we have learned from past mistakes
Security would be something that must be built in from the
start. As mentioned before the AIDoc by itself should not be allowed to perform
some actions like transferring itself without permissions. By default the
author would be the only person that is allowed to interact with the AIDoc. The
author can give permission that others may view or modify some or all of the AIDoc.
Possibly by default normal authors should also not be allowed to make changes
to the way the AIDoc behaves. This should probably be a more
developer/administrator privilege. An AIDoc must be able to detect if there has
been any tampering with itself or the data it contains. If so it must warn the
author and/or administrator.
Self preservation
Self preservation would also have to be built into the AIDoc.
As explained in the transfer of the AIDoc, it must ensure the system is capable
of sustaining it before attempting to run on another piece of hardware. It must
also, in a balanced way, ensure that the system where it is currently ‘living’
has enough resources to sustain it and that, if something goes wrong (even fancy
hardware can physically fail), it has been persisted in some format so it can be
brought back ‘from the dead’.
One thing that should be noticeable now is that the format
of traditional documents would be irrelevant. There would be no specific text,
spreadsheet or image documents. In a sense the AIDoc will contain other
documents or streams of data that it will organize internally – an entire file
system by itself.
Conclusion
Many of the concepts described here are not possible today.
Speech recognition and audio/visual feedback are not really there yet to the
degree that it can be sustained within a separate entity like an AIDoc. Currently
it requires the entire system to perform these functions.
Potentially an AIDoc can become huge by our current
standards. Even when it is young, just born, it may already be much bigger
than any existing system can handle. Keep in mind it will already contain all
the functionality to do things like speech recognition, perhaps visual
recognition, follow instructions intelligently, make some decisions by itself
and understand and speak at least one language.
It seems the best utilities are created when you have a specific need for something. I've often needed to monitor ad-hoc Windows services and also wanted to stop or start them as I like. SQL Server has its own 'service manager' which is great for quickly stopping or starting SQL Server. It was almost natural that I wanted to create something similar so I can monitor any other service I want and have the ability to stop/start it as needed.
Like most of my other 'little' utilities it evolved over time from something I quickly created just for myself to something I can share with others.
This one is called 'Service monitor' and it minimizes to the system tray just like the SQL Server service manager. Right-clicking it allows you to stop or start the currently selected service. There is a main window from which you can select multiple services and also stop or start them all. One recent add-on is the ability to select a service and monitor it separately. This way you can monitor multiple services at once.
Try it out - it's free and simple to use.
It requires the .Net 1.1 framework which hopefully is common enough these days for everyone.
I haven't been very active in the online community lately - the last couple of months. The reasons have got to do with some major personal changes going on in my life. Unfortunately I cannot go into any more details. For those that know me please bear with me and if possible try to understand I don't have anything against any of you. For the others, please continue to ignore me.
Life is not always fair and things don't always work out the way we planned or hoped. That is why they call it 'life'...
It’s interesting how some applications 'evolve' over time based on the technologies available at the time. Take for example this utility or small application I created to track where my files and backups are. It all started out in the days I still did Access development... yes, I know that sounds almost like an oxymoron, but it is true that you can write fully functional applications using Microsoft Access. My initial aim was to create a database of what is where on all my backup disks. If you have dozens of backup disks lying around it gets difficult to remember what files are on which backup... you all should know the story.
In the beginning it was stiffy disks, sheesh, that was ages ago and probably not the wisest thing to do - backup on stiffy: guaranteed way to lose stuff *shivers*. Later, when CD writing became possible, it was a little bit tricky to write down on the disk all the filenames on it.
The initial application required manual input but that was quickly replaced by a routine to 'scan' the CDs and simply log the filenames, sizes and dates. Then VB came along. The first VB version was written in VB5/6. It worked. It wasn't called SoftTrack as such but the basic structures were already there. The Access database had now become simply a data store. Version 1 (sort of).
Then VB.Net 1.0 came along. The VB6 app was simply ported to .Net without any new functionality. It worked too. Version 2.
Then I started working in C# (1.1). So the application was rewritten from the ground up and I still use it as reference. Version 3.
Lastly, when C# 2.0 came out I thought it could be a good exercise to rewrite the application again. Again it was rewritten from the ground up. Some minor features were dropped and some added. Version 4.
It seems that I started using this application as a learning tool each time a new level of technology or platform came out. Personally I think this is a great way to learn new things: by actually working with them and creating something useful. Writing one or two liner applications doesn't teach you all the quirks and advantages of a new environment. In the end you sit with something useful that can actually be used as a reference for writing other applications. That does not always mean a good reference. What I mean is that often you find out there were better ways of doing things you didn't know at the time. Looking back at the source code of the earlier versions of my app I often think 'what the hell was I thinking doing that...'. I'm sure I'll do it again in the future with the now 'new' version.
Are there any plans for updates to my application as it is? For sure! I seriously considered using SQLExpress as the data store and will probably eventually go that route. The reason I haven't done it yet is portability. By this I mean being able to simply copy the data file to another machine without having any dependencies on services or other stuff. That is still one of Jet's biggest advantages. The application as it is (since version 3) can actually create the mdb database from scratch - literally. I built a module that simply takes the sql statements to create a database and runs them in the application, creating everything from scratch. Thus you can copy the exe plus one proxy dll to a new machine (which just needs the .Net framework) and run it, creating a new database.
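The 'run the sql statements from inside the application' idea can be sketched roughly as follows. This is not the SoftTrack code: SchemaBuilder, SplitStatements and Run are hypothetical names, the two table definitions are invented, and the sketch assumes a connection that is already open against an empty database. The naive split on ';' would also break on semicolons inside string literals.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class SchemaBuilder
{
    // Splits a script into individual statements on ';' and drops empty ones.
    public static List<string> SplitStatements(string script)
    {
        List<string> statements = new List<string>();
        foreach (string part in script.Split(';'))
        {
            string statement = part.Trim();
            if (statement.Length > 0) statements.Add(statement);
        }
        return statements;
    }

    // Executes each statement in order against an already open connection.
    public static void Run(IDbConnection connection, string script)
    {
        foreach (string statement in SplitStatements(script))
        {
            using (IDbCommand command = connection.CreateCommand())
            {
                command.CommandText = statement;
                command.ExecuteNonQuery();
            }
        }
    }

    public static void Main()
    {
        // Invented schema, only to show the splitting step without a database.
        string script =
            "CREATE TABLE Disks (Id INT, Label VARCHAR(50));" +
            "CREATE TABLE Files (Id INT, DiskId INT, Name VARCHAR(255));";
        foreach (string statement in SplitStatements(script))
        {
            Console.WriteLine(statement);
        }
    }
}
```

Because Run only depends on IDbConnection, the same script runner would work with an OleDbConnection for Jet or a SqlConnection for SQLExpress, as long as the statements themselves are valid for that engine.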
People interested in this application are welcome to contact me. Perhaps I should expose it as an open source product. I use it all the time myself and can imagine it must be useful to others too.
After some more playing around and updating the test app I discovered it's not all that obvious which database system beats the others in every case. It seems each has strong and weak points and the results are not always as expected. The surprising bit is that Access (Jet) outperforms MySql and SQLExpress in several types of queries. These tend to be the simpler types of select statements. Of course, all tests were performed 'local', thus Jet would probably benefit from the fewer application layers it needs to go through. The situation changes, however, as the queries become more complex.
As far as MySql and SQLExpress go, I could not find one real situation where MySql comes out on top without some doubt. The only test it won was one of the simple queries, where I have serious doubts that the test is valid.
One very surprising thing I found, and could almost not believe myself, was with what we call 'full text search' on a varchar type field. Jet beat the *censored* out of both its bigger brother and its opponent. I ran the test several times to make sure. I suspect Jet is more optimized for those general string searches inside a limited size string (remember Text fields in Jet can only go up to 255 characters). The table used for the test had over 900 000 rows and the field was NOT indexed in any of the databases.
The following is a summary of the results. The numbers are measured in milliseconds (average). Numbers in italic are 'not trusted' as they might be measuring errors being too short for the way I tried measuring times. These numbers should not be used as absolutes but rather as relative indicators. Things like other background processes could have 'stolen' a few milliseconds here and there.
| Test description | Access (Jet 4.0) | MySql 5.0 | SQL Express |
|---|---|---|---|
| Simple count select | 47 | 1516 | 200 |
| Select large table | 31 | 31 | 62 |
| select with where on Key | 16 | 47 | 140 |
| select with where on number field (no index) | 200 | 2547 | 94 |
| select with where on varchar matching whole field like 'some' | 16 | 2344 | 200 |
| select with where on varchar beginning of field like 'some%' | 78 | 2300 | 234 |
| select with where on varchar any part of field like '%some%' | 47 | 5780 | 3200 |
| select with join and where ... | 16 | 172 | 109 |
| select with 2 table joins | 3000 | 5300 | 200 |
| select with 3 table joins | 16 | 5546 | 3313 |
| select with group by | 6750 | 4100 | 438 |
| select sum of big table | 782 | 1672 | 266 |
All tests were 'ad hoc/free text' queries. I still have to try and do stored procedures. For those that don't know it Jet supports stored procs too (they call them queries).
Summary
In the end I have to summarize the results like this. When using simple queries it does not seem to matter which database system you use as far as performance goes. The architecture of the total application is more important then.
As SQL query complexity increases it seems the bigger guns are more optimized to handle parsing and executing statements. If performance is your main requirement I would opt for SQLExpress (or full SQL Server 2005). If cross platform compatibility or pricing is your main concern then MySql would do the job. No surprises here.
As far as ease of use goes there is no clear winner. MySql has grown up.
A few days ago I blogged about the updated SQL books online (December 2005). Unfortunately I discovered it completely breaks your 'Management Studio Express CTP' and the only way to fix it is by renaming/deleting a registry key. Really stupid if you ask me!
To get 'Management Studio Express' working again you have to delete or rename the registry key [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\Tools\Shell].
This effectively breaks SQL books online for SQL Server Express again. To restore it you have to put the registry key back. It means you cannot use both applications at the same time.
This sucks heavily!! Please fix it Microsoft!
After starting to play with MySql I couldn't help wondering how it would compare to SQLExpress. Sure, against Access (JET) it was faster - in most cases by a good deal. So I took a 'small' database I developed in Access (and still use as the plain data store) and exported the data to both MySql and SQLExpress. One table in particular has over 1 million rows, which makes for some properly large queries and result sets.
I used exactly the same code line by line with only the specific namespace objects different - i.e. use MySqlConnection vs. SqlConnection, MySqlCommand vs. SqlCommand etc.
The front end app is a simple Windows form app in .Net 1.1 for now. I will try the same with .Net 2.0 as well. The app simply has one text box for the query, one text box for the results, one label for the timing results and two buttons - one for MySql and the other for SQLExpress.
For MySql I'm using the 'MySQL Connector Net 1.0.7' library. For SQLExpress I simply used the built in System.Data.SqlClient.
Using this I can run the same query against both databases and see the time difference. It is simply a way to ensure both database systems get exactly the same sql so I can compare the results.
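A timing helper along these lines could sit behind both buttons. It is a sketch under a few assumptions: QueryTimer, RunQuery and AverageMilliseconds are hypothetical names, and it uses System.Diagnostics.Stopwatch and the Action delegate, which only arrived in later framework versions (Stopwatch in .Net 2.0), so it is not what the .Net 1.1 app did. A high-resolution timer matters because a coarse clock such as DateTime.Now typically advances in steps of around 15 ms on Windows, which makes very short timings unreliable.

```csharp
using System;
using System.Data;
using System.Diagnostics;
using System.Threading;

public static class QueryTimer
{
    // Executes the query and reads every row so that fetch time is included.
    public static void RunQuery(IDbConnection connection, string sql)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = sql;
            using (IDataReader reader = command.ExecuteReader())
            {
                while (reader.Read()) { }
            }
        }
    }

    // Average wall-clock milliseconds over several runs of the same work.
    public static double AverageMilliseconds(Action work, int runs)
    {
        Stopwatch watch = new Stopwatch();
        for (int i = 0; i < runs; i++)
        {
            watch.Start();   // resumes timing; elapsed time accumulates across runs
            work();
            watch.Stop();
        }
        return watch.Elapsed.TotalMilliseconds / runs;
    }

    public static void Main()
    {
        // Stand-in workload so the sketch runs without a database.
        double average = AverageMilliseconds(() => Thread.Sleep(20), 5);
        Console.WriteLine("average ms: " + average);
    }
}
```

With a real database the call would look something like `QueryTimer.AverageMilliseconds(() => QueryTimer.RunQuery(connection, sql), 5)`, where connection is an open SqlConnection or MySqlConnection. Since both implement IDbConnection, the measured code path is identical for the two systems.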
What I found was that in most cases SQLExpress was almost twice as fast as MySql. I did not do installation or configuration optimizations for either db system - everything was left stock standard. Whether this means SQLExpress is always faster than MySql I don't know, but for the 'ad hoc' queries I tried it made a big difference in response time. I tried simple queries and complex joins between multiple tables with wildcard 'like' where clauses.
Other factors I still need to compare are resource utilization and things like ease of use. SQLExpress seems to allocate more memory or at least hang on more to it. Then again nothing else is using the system so the operating system won't deallocate the memory just for that. For ease of use they seem pretty close given that MySql now has some really nice UI admin tools. For SQLExpress I use the 'SQLExpress Management Studio Express' which is a beta.
One big benefit MySql still has over any Microsoft product is that it is truly platform independent - it can run on Linux, Unix etc. Thus porting your data is as easy as just copying the data files or doing a backup/restore. However, upsizing it to an enterprise system is not that easy. If you want to use it in a big enterprise system you need to convert your data to something like DB2, Oracle, Sybase or.... SQL Server. With SQLExpress you can just copy the data files to an enterprise SQL Server (provided you are only using a Windows environment).
[Update: forgot to mention that I'm using MySql 5.0 for these tests. Also, when doing multi-connection queries MySql seems to be hit harder than SQLExpress. I've now created a C# 2.0 test app and the timings seem to be exactly the same for both. At least it's not worse.]
Today I encountered a problem with VS 2003 that seems to be not too uncommon when doing certain things with .Net. I'm not exactly sure yet what caused the problem. It could be some shareware thingy I downloaded to evaluate or the mysql installation. Those two are the only things that changed recently on my machine. I hope (kinda) that it was just the stupid shareware utility, which I removed anyway. Otherwise it could mean some problems for people wanting to use mysql and .Net.
But at least the fix seems to be easy. Thanks to Google *all hail the big search engine people* I found a link on some forum where people asked the same question. This link proved the most helpful.
It seems the scripting engine that the VS IDE used got corrupted or something. Reinstalling the version 5.6 scripting engine and re-registering two files (dte.olb and vslangproj.tlb) corrected the problem.
Yes, for those that still remember what this is about, there is still 'life' in this idea. Over the holiday season I had some time to think about possibilities on how to get the project off the ground or other ways to go about it. Then I thought about the .Net juniors, with whom we discussed a possible more practical project to learn more about .Net coding. You can possibly see where I'm going with this. However, this project will not only be for juniors. There is also an aspect of 'users' here. These will be people that will 'set' the requirements. I'm going to be one of them. There are still lots of things that need to be thought through but perhaps I'm on to something here.
The basic idea is to look for two groups of people - but they may be just one group with some people specializing more in one area than the other - users-developers.
One big thing I'm still contemplating is whether this should be a web-based technology project at all or not. For now at least know this: the idea is not dead - just postponed a bit. I'd like to hear from people interested.
It's been a while (several years) since I last played with any version of mysql and my experience then wasn't so good. Today I had to look at it again for some family project and downloaded the 4.1 version since that is closer to the version available on the server. Perhaps one day I'll try 5.0 as well.
Anyway, the requirement was to convert an old Access database to mysql so it can be hosted on a web server. Exporting the data was the easy part. The problem part was the fact that the ODBC driver for mysql seems to ignore or throw away the meta data of the table being imported. None of the 'not null' settings, keys, indexes etc. were imported. The result was I had to recreate these by hand.
Still, the mysql database server and the admin UI they have available on the mysql site seem good. One interesting thing was that the backups are in sql form - that is, when you back up the database it is just a bunch of sql statements that you can 'replay' on the other side to restore the database. That's really cool.
I think I'll play a bit more with it as time permits. I'd like to see how it compares with SQL Express in terms of performance, resource footprint etc.
Interestingly enough they both seem to have a similar memory footprint when idling...