BlogJava

Compressing web service requests and responses at the transport layer

黑灵 — sun, 23 jun 2013 13:45:00 gmt


黑灵 2013-06-23 21:45


Implementing a cyclic sequence in MongoDB

黑灵 — fri, 26 apr 2013 14:57:00 gmt

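The environment: the server is written in Java and the database is MongoDB.

The requirement: our system has to generate a unique ID. The leading part follows a fixed format and is tied to the time, accurate to the microsecond. Because concurrent requests can arrive within the same microsecond, a two-digit sequence number is appended. The system is specified to handle fewer than 100 concurrent requests per millisecond, so two digits are enough. Several Java servers can generate this ID, and different machines must never produce the same sequence number, so the sequence has to be synchronized globally at a single point to stay unique. If the ID did not need a meaningful format, a UUID or MongoDB's ObjectId would be a better choice: neither needs a synchronization point, so both perform better.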
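The post does not give the actual ID format, so the following is only a minimal sketch of the layout described above. The prefix "ORD", the timestamp pattern, and the name buildId are all assumptions made for illustration. It also uses millisecond precision, because java.util.Date carries nothing finer.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class UniqueIdFormat {
    // Hypothetical layout: prefix + UTC timestamp (millisecond precision) + two-digit sequence.
    static String buildId(String prefix, long epochMillis, int seq) {
        if (seq < 0 || seq > 99) {
            throw new IllegalArgumentException("sequence must fit in two digits: " + seq);
        }
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmssSSS", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return prefix + fmt.format(new Date(epochMillis)) + String.format(Locale.ROOT, "%02d", seq);
    }

    public static void main(String[] args) {
        // The sequence value would come from the shared counter; 7 is just a sample.
        System.out.println(buildId("ORD", System.currentTimeMillis(), 7));
    }
}
```

Only the trailing two digits need global coordination; the rest of the ID can be built locally on each server.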

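That single point could be a dedicated sequence server or the database. Standing up a server just for this simple feature is clearly overkill. However, MongoDB has no locking mechanism like SELECT ... FOR UPDATE in MySQL or Oracle, so there is no simple way to read and update the sequence number under a lock. What MongoDB does offer is that many update operators are atomic on a single document, for example:

$set
$unset
$inc
$push
$pushAll
$pull
$pullAll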

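We can use these atomic operations to implement a cyclic sequence field at the database layer in optimistic-locking style. To make it easy to call, I wrapped the logic in a JavaScript function stored in the database, similar to a stored procedure in MySQL.

First we need a collection to hold the sequence numbers, and we initialize the sequence we need. We call the collection counters.

JavaScript code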

db.counters.save({_id:"serialno1", val:0, maxval:99})

Then we add a JavaScript function to system.js:

JavaScript code

db.system.js.save({_id:"getnextuniqueseq",
value:function (keyname) {
    var seqobj = db.counters.findOne({_id:keyname});
    if (seqobj == null) {
        print("can not find record with key: " + keyname);
        return -1;
    }
    // the max value of sequence
    var maxval = seqobj.maxval;
    // the current value of sequence
    var curval = seqobj.val;
    while(true){
        // if curval reach max, reset it
        if(curval >= maxval){
            // the query on val makes this a compare-and-set: it only matches if val is still curval
            db.counters.update({_id : keyname, val : curval}, { $set : { val : 0 }}, false, false);
            var err = db.getLastErrorObj();
            if( err && err.code ) {
                print( "unexpected error reset data: " + tojson( err ) );
                return -2;
            } else if (err.n == 0){
                // fail to reset value, may be reset by others
                print("fail to reset value: ");
            }
            // get current value again.
            seqobj = db.counters.findOne({_id:keyname});
            maxval = seqobj.maxval;
            curval = seqobj.val;
            continue;
        }
        // if curval not reach the max, inc it;
        // increase
        db.counters.update({_id : keyname, val : curval}, { $inc : { val : 1 }}, false, false);
        var err = db.getLastErrorObj();
        if( err && err.code ) {
            print( "unexpected error inc val: " + tojson( err ) );
            return -3;
        } else if (err.n == 0){
            // fail to increase value, may be increased by others
            print("fail to inc value: ");
            // get current value again.
            seqobj = db.counters.findOne({_id:keyname});
            maxval = seqobj.maxval;
            curval = seqobj.val;
            continue;
        } else {
            var retval = curval + 1;
            print("success to get seq : " + retval);
            // increase successful
            return retval;
        }
    }
}
});

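The function above adds 1 to the val of the given sequence and starts again from 0 once val reaches the upper limit, which is why it is called a cyclic sequence. Because it returns the incremented value, a counter with val:0 and maxval:99 yields 1 through 99 and then wraps around.

In principle this works the same way as Java's AtomicInteger family: a retry loop around an atomic CAS. In a multithreaded environment the locked region (monitor) is very small, so concurrency is somewhat better than with an exclusive lock.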
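To make the comparison with AtomicInteger concrete, here is a minimal in-memory sketch of the same cyclic sequence. The class name CyclicSequence is hypothetical, and the sketch folds the stored function's two updates (reset to 0, then increment) into a single CAS, so it is an analogue of the MongoDB version rather than a port of it.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class CyclicSequence {
    private final AtomicInteger val = new AtomicInteger(0);
    private final int maxVal;

    CyclicSequence(int maxVal) {
        this.maxVal = maxVal;
    }

    // Returns 1..maxVal and then wraps around to 1 again.
    int next() {
        for (;;) {
            int cur = val.get();
            // At the upper limit the next value is 1 (reset to 0 plus one increment).
            int nxt = (cur >= maxVal) ? 1 : cur + 1;
            if (val.compareAndSet(cur, nxt)) {
                return nxt;
            }
            // CAS failed: another thread changed val, so re-read and retry.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final CyclicSequence seq = new CyclicSequence(100);
        final AtomicIntegerArray hits = new AtomicIntegerArray(101);
        Thread[] threads = new Thread[20];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) {
                    hits.incrementAndGet(seq.next());
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // 2000 calls over a cycle of 100 values: each value must appear 20 times.
        for (int n = 1; n <= 100; n++) {
            if (hits.get(n) != 20) {
                throw new AssertionError(n + " was returned " + hits.get(n) + " times");
            }
        }
        System.out.println("each value 1..100 was returned 20 times");
    }
}
```

This only coordinates threads inside one JVM. The MongoDB version exists precisely because several servers have to share one counter.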

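Now let's test the function from Java to check that it is correct, that is, that no duplicate sequence number is returned under multithreading.

First test: val=0, maxval=2000, with 20 Java threads each calling the function 100 times, 2000 calls in total. If everything is correct, every number from 1 to 2000 appears exactly once.

Java code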

import static org.junit.Assert.assertEquals;

import java.net.UnknownHostException;

import org.junit.Test;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.Mongo;

@Test
public void testgetnextuniqueseq1() throws Exception {
    final int thread_count = 20;
    final int loop_count = 100;
    Mongo mongoclient = new Mongo("172.17.2.100", 27017);
    DB db = mongoclient.getDB("im");
    db.authenticate("imadmin", "imadmin".toCharArray());
    // reset the counter: val=0, maxval=2000
    BasicDBObject q = new BasicDBObject();
    q.put("_id", "unique_key");
    BasicDBObject upd = new BasicDBObject();
    BasicDBObject set = new BasicDBObject();
    set.put("val", 0);
    set.put("maxval", thread_count * loop_count);
    upd.put("$set", set);
    // upsert=true so the counter is created if "unique_key" does not exist yet
    db.getCollection("counters").update(q, upd, true, false);
    Thread[] threads = new Thread[thread_count];
    final int[][] results = new int[thread_count][loop_count];
    for (int i = 0; i < thread_count; i++) {
        final int temp_i = i;
        threads[i] = new Thread("" + i) {
            @Override
            public void run() {
                try {
                    // each thread uses its own connection
                    Mongo mongoclient = new Mongo("172.17.2.100", 27017);
                    DB db = mongoclient.getDB("im");
                    db.authenticate("imadmin", "imadmin".toCharArray());
                    for (int j = 0; j < loop_count; j++) {
                        Object result = db.eval("getnextuniqueseq(\"unique_key\")");
                        System.out.printf("thread %s, seq=%d\n", Thread.currentThread().getName(), ((Double) result).intValue());
                        results[temp_i][j] = ((Double) result).intValue();
                    }
                } catch (UnknownHostException e) {
                    e.printStackTrace();
                }
            }
        };
    }
    for (Thread thread : threads) {
        thread.start();
    }
    for (Thread thread : threads) {
        thread.join();
    }
    for (int num = 1; num <= loop_count * thread_count; num++) {
        // every number appear 1 times only!
        int times = 0;
        for (int j = 0; j < thread_count; j++) {
            for (int k = 0; k < loop_count; k++) {
                if (results[j][k] == num)
                    times++;
            }
        }
        assertEquals(1, times);
    }
}

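Next we test the wrap-around case: val=0, maxval=100. Again 20 Java threads each call the function 100 times, 2000 calls in total. This time every number from 1 to 100 should be returned 20 times.

Java code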

import static org.junit.Assert.assertEquals;

import java.net.UnknownHostException;

import org.junit.Test;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.Mongo;

@Test
public void testgetnextuniqueseq2() throws Exception {
    final int thread_count = 20;
    final int loop_count = 100;
    Mongo mongoclient = new Mongo("172.17.2.100", 27017);
    DB db = mongoclient.getDB("im");
    db.authenticate("imadmin", "imadmin".toCharArray());
    // reset the counter: val=0, maxval=100, so the sequence wraps 20 times
    BasicDBObject q = new BasicDBObject();
    q.put("_id", "unique_key");
    BasicDBObject upd = new BasicDBObject();
    BasicDBObject set = new BasicDBObject();
    set.put("val", 0);
    set.put("maxval", loop_count);
    upd.put("$set", set);
    // upsert=true so the counter is created if "unique_key" does not exist yet
    db.getCollection("counters").update(q, upd, true, false);
    Thread[] threads = new Thread[thread_count];
    final int[][] results = new int[thread_count][loop_count];
    for (int i = 0; i < thread_count; i++) {
        final int temp_i = i;
        threads[i] = new Thread("" + i) {
            @Override
            public void run() {
                try {
                    // each thread uses its own connection
                    Mongo mongoclient = new Mongo("172.17.2.100", 27017);
                    DB db = mongoclient.getDB("im");
                    db.authenticate("imadmin", "imadmin".toCharArray());
                    for (int j = 0; j < loop_count; j++) {
                        Object result = db.eval("getnextuniqueseq(\"unique_key\")");
                        System.out.printf("thread %s, seq=%d\n", Thread.currentThread().getName(), ((Double) result).intValue());
                        results[temp_i][j] = ((Double) result).intValue();
                    }
                } catch (UnknownHostException e) {
                    e.printStackTrace();
                }
            }
        };
    }
    for (Thread thread : threads) {
        thread.start();
    }
    for (Thread thread : threads) {
        thread.join();
    }
    for (int num = 1; num <= loop_count; num++) {
        // every number appear 20 times only!
        int times = 0;
        for (int j = 0; j < thread_count; j++) {
            for (int k = 0; k < loop_count; k++) {
                if (results[j][k] == num)
                    times++;
            }
        }
        assertEquals(20, times);
    }
}

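I ran these tests several times and they passed every time.

Since I had no other implementation to compare against (for example, one based on an exclusive lock), I did not run a performance test.

A final note. MongoDB does support stored JavaScript, which resembles stored procedures, but using it to solve complex problems is not recommended. The main reasons are that it cannot be debugged and is inconvenient to maintain. In addition, before 2.4 MongoDB's server-side JavaScript support was limited: a mongod process could run only one piece of JavaScript at a time. If the problem can be solved in the application layer, implementing the logic there is the better choice.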

黑灵 2013-04-26 22:57


Writing debug information to the Postfix log

黑灵 — fri, 26 apr 2013 11:25:00 gmt

/etc/postfix/main.cf

debug_peer_list = example.com
debug_peer_level = 2

/etc/postfix/master.cf

smtp inet n - n - - smtpd -v

黑灵 2013-04-26 19:25


compareAndSet (CAS) in Java

黑灵 — wed, 24 apr 2013 09:20:00 gmt

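Since JDK 5, the java.util.concurrent package has provided many classes for concurrent programming. They perform well on multi-core machines,
mainly because most of them use optimistic locking (fail and retry) rather than pessimistic locking with synchronized.

Today I had time to trace the implementation of AtomicInteger.incrementAndGet.
I don't know concurrent programming especially well, so this is just a note to make deeper study easier later.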

1. Implementation of incrementAndGet

    public final int incrementAndGet() {
        for (;;) {
            int current = get();
            int next = current + 1;
            if (compareAndSet(current, next))
                return next;
        }
    }

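First, we can see that it spins in an infinite loop until the increment succeeds.
Each iteration does the following:
1. Read the current value.
2. Compute the value plus 1.
3. If the current value is still valid (it has not been changed by another thread), set the incremented value.
4. If the set did not succeed (the current value is stale because another thread changed it), start again from step 1.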
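The same four steps work for any read-modify-write operation, not just incrementing. As a minimal sketch, here is an atomic "raise to at least this value" built on the same loop; CasRetry and updateMax are hypothetical names, not part of the JDK.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasRetry {
    // Atomically raises target to at least candidate and returns the resulting value.
    static int updateMax(AtomicInteger target, int candidate) {
        for (;;) {
            int current = target.get();               // step 1: read the current value
            int next = Math.max(current, candidate);  // step 2: compute the new value
            if (target.compareAndSet(current, next)) {
                return next;                          // step 3: still valid, value is set
            }
            // step 4: another thread changed the value, so start over
        }
    }

    public static void main(String[] args) {
        AtomicInteger highWaterMark = new AtomicInteger(5);
        System.out.println(updateMax(highWaterMark, 3));
        System.out.println(updateMax(highWaterMark, 9));
    }
}
```

Only step 2 changes from one operation to another; the read, the compareAndSet, and the retry stay the same.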

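2. Implementation of compareAndSet

    public final boolean compareAndSet(int expect, int update) {
        return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
    }

It directly calls the compareAndSwapInt method of the Unsafe class.
The full name is sun.misc.Unsafe. This class is the implementation provided by Oracle (Sun); JDKs from other vendors may use a different class.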

3. Implementation of compareAndSwapInt

    /**
     * Atomically update Java variable to x if it is currently
     * holding expected.
     * @return true if successful
     */
    public final native boolean compareAndSwapInt(Object o, long offset,
                                                  int expected,
                                                  int x);

As we can see, this is not implemented in Java. It is a native method, so the call goes into native code (inside the JVM itself, as the next step shows).

4. Native implementation of compareAndSwapInt
If you have downloaded the OpenJDK source code, you can find unsafe.cpp in the hotspot\src\share\vm\prims\ directory.

UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapInt(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jint e, jint x))
  UnsafeWrapper("Unsafe_CompareAndSwapInt");
  oop p = JNIHandles::resolve(obj);
  jint* addr = (jint *) index_oop_from_field_offset_long(p, offset);
  return (jint)(Atomic::cmpxchg(x, addr, e)) == e;
UNSAFE_END

We can see that it actually calls the cmpxchg method of the Atomic class.

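5. cmpxchg in Atomic
The implementation of this class depends on both the operating system and the CPU architecture. For x86 on Windows,
it is in the file atomic_windows_x86.inline.hpp in the hotspot\src\os_cpu\windows_x86\vm\ directory.

inline jint     Atomic::cmpxchg    (jint     exchange_value, volatile jint*     dest, jint     compare_value) {
  // alternative for InterlockedCompareExchange
  int mp = os::is_MP();
  __asm {
    mov edx, dest
    mov ecx, exchange_value
    mov eax, compare_value
    LOCK_IF_MP(mp)
    cmpxchg dword ptr [edx], ecx
  }
}

Here we can see that it is implemented with inline assembly, and the key CPU instruction is cmpxchg.
The code cannot be traced any further down, which means the atomicity of CAS is actually implemented by the CPU. There is in fact still an exclusive lock at this level (the lock prefix emitted on multiprocessor machines), but it is held for far less time than a synchronized block, so performance under multithreading is better.

The code contains the comment "alternative for InterlockedCompareExchange".
InterlockedCompareExchange is a function in the Windows API that does the same thing as the assembly above.
http://msdn.microsoft.com/en-us/library/windows/desktop/ms683560(v=vs.85).aspx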

6. Finally, the x86 cmpxchg instruction

Opcode CMPXCHG

CPU: i486
Type of instruction: User

Instruction: CMPXCHG dest, src

Description: Compares the accumulator with dest. If equal the "dest"
is loaded with "src", otherwise the accumulator is loaded
with "dest".

Flags affected: AF, CF, OF, PF, SF, ZF

CPU mode: RM,PM,VM,SMM

Clocks:
CMPXCHG reg, reg 6
CMPXCHG mem, reg 7 (10 if comparison fails)
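The description above is easier to follow as code. Below is a minimal single-threaded model of the instruction's semantics; CmpxchgModel and its fields are hypothetical names, and the model is deliberately not atomic (on real hardware the atomicity comes from the CPU and the lock prefix).

```java
class CmpxchgModel {
    int accumulator; // models EAX, which holds the expected (compare) value
    int dest;        // models the memory operand
    boolean zf;      // models the zero flag: true when the comparison succeeded

    // Models "cmpxchg dest, src" for one thread; this is NOT atomic.
    void cmpxchg(int src) {
        if (accumulator == dest) {
            dest = src;         // equal: dest is loaded with src
            zf = true;
        } else {
            accumulator = dest; // not equal: accumulator is loaded with dest
            zf = false;
        }
    }

    public static void main(String[] args) {
        CmpxchgModel m = new CmpxchgModel();
        m.accumulator = 1; // expected value
        m.dest = 1;        // current value in memory
        m.cmpxchg(5);
        System.out.println("dest=" + m.dest + " zf=" + m.zf);
    }
}
```

In both cases the accumulator ends up holding the value that was in dest before the instruction ran. That is why unsafe.cpp can test success by comparing the return value of Atomic::cmpxchg with e.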

黑灵 2013-04-24 17:20


Apache MINA documentation translation - Features

黑灵 — tue, 23 apr 2013 14:02:00 gmt

http://mina.apache.org/mina-project/features.html

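MINA is a simple yet full-featured network application framework. It provides the following features:

A unified API for various transport types
    TCP/IP and UDP/IP communication via Java NIO
    Serial communication (RS232) via RXTX
    In-VM pipe communication
    You can implement your own transport

Extension points through the filter interface, similar to Servlet filters

Low-level and high-level APIs
    Low-level: uses ByteBuffers
    High-level: uses user-defined message objects and codecs

A freely customizable thread model
    Single thread
    One thread pool
    Multiple thread pools

Out-of-the-box SSL, TLS, and StartTLS support using the Java 5 SSLEngine

Overload shielding and bandwidth throttling

Unit testability using mock objects

Server management through JMX

Stream-based I/O support via StreamIoHandler

Integration with well-known containers such as PicoContainer and Spring

Easy migration from Netty.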

黑灵 2013-04-23 22:02


Apache MINA documentation translation - Overview

黑灵 — mon, 22 apr 2013 15:13:00 gmt

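Original: http://mina.apache.org/mina-project/index.html

Apache MINA is a network application framework that helps you develop high-performance, highly scalable network applications easily. Under the hood it is built on Java NIO and provides an abstract, event-driven, asynchronous API over transports such as TCP/IP and UDP/IP.

Apache MINA is often called:

a NIO framework or library

a client/server framework or library

a networking socket library

Nevertheless, Apache MINA offers much more than that. Take a look at its feature list, which lets you develop network applications quickly, and at what people say about MINA. Please download MINA, try the quick start guide, browse the FAQ, or join our community.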


黑灵 2013-04-22 23:13


Properties.storeToXML throws a NullPointerException!

黑灵 — fri, 19 apr 2013 08:42:00 gmt

Consider code like this:

    @Test(expected = IOException.class)
    public void testpropertiesstoretoxml() throws IOException {
        Properties props = new Properties();
        // a Boolean value, not a String
        props.put("key1", true);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        props.storeToXML(baos, null);
        String xml = new String(baos.toByteArray());
        Assert.fail("should not go to here");
    }

An IOException is thrown while the XML is generated. It is caused by a NullPointerException raised during the XML transform.

That seemed strange, so I stepped into the Properties code in the debugger.

    public String getProperty(String key) {
        Object oval = super.get(key);
        String sval = (oval instanceof String) ? (String)oval : null;
        return ((sval == null) && (defaults != null)) ? defaults.getProperty(key) : sval;
    }

So Properties simply returns null for every value that is not a String.

The XML transform then calls a method directly on that null.

Later I read the Properties documentation. Properties extends Hashtable, so it has methods such as put and putAll, but using them is discouraged
because they do not restrict values to String. setProperty is recommended instead, since its value is always a String.

Because Properties inherits from Hashtable, the put and putAll methods can be applied to a Properties object. Their use is strongly discouraged as they allow the caller to insert entries whose keys or values are not Strings. The setProperty method should be used instead. If the store or save method is called on a "compromised" Properties object that contains a non-String key or value, the call will fail. Similarly, the call to the propertyNames or list method will fail if it is called on a "compromised" Properties object that contains a non-String key.

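OK, I admit it was my fault for using it without reading the documentation properly. But wouldn't it be better if non-String values were converted with toString and then used?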
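Following the documentation's advice, a minimal sketch of the safe variant looks like this: convert the value to a String yourself and store it with setProperty. The class PropertiesXmlDemo and the helper toXml are hypothetical names used only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesXmlDemo {
    // Serializes the properties to XML; storeToXML(os, comment) writes UTF-8.
    static String toXml(Properties props) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            props.storeToXML(baos, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Convert explicitly instead of calling put("key1", true).
        props.setProperty("key1", String.valueOf(true));
        System.out.println(toXml(props));

        Properties raw = new Properties();
        raw.put("key1", Boolean.TRUE);
        // getProperty only returns String values, so this prints null.
        System.out.println(raw.getProperty("key1"));
    }
}
```

The second half of main reproduces the root cause: the Boolean is stored in the Hashtable, but getProperty cannot see it.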

黑灵 2013-04-19 16:42


About the parameter of Spring MVC's @InitBinder annotation

黑灵 — tue, 16 apr 2013 16:42:00 gmt

Take setting a validator as an example.
I have been wondering whether a validator can be set separately for each request or each action method,
that is, whether a controller can have several @InitBinder-annotated methods that are called for different actions.

I noticed the value parameter of @InitBinder.

The API docs describe it like this:
The names of command/form attributes and/or request parameters that this init-binder method is supposed to apply to.

Default is to apply to all command/form attributes and all request parameters processed by the annotated handler class. Specifying model attribute names or request parameter names here restricts the init-binder method to those specific attributes/parameters, with different init-binder methods typically applying to different groups of attributes or parameters.

So it seems that different init-binder methods can be called for different form objects or commands.

I wrote the following controller to try it out:

import java.text.DateFormat;
import java.util.Date;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.WebDataBinder;
import org.springframework.web.bind.annotation.InitBinder;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;

@Controller
public class HomeController {

    private static final Logger logger = LoggerFactory.getLogger(HomeController.class);

    // applies only when the "action1" attribute/parameter is bound
    @InitBinder("action1")
    public void initbinder1(WebDataBinder binder){
        logger.info("initbinder1");
    }

    // applies only when the "action2" attribute/parameter is bound
    @InitBinder("action2")
    public void initbinder2(WebDataBinder binder){
        logger.info("initbinder2");
    }

    /**
     * Simply selects the home view to render by returning its name.
     */
    @RequestMapping(value = "/", method = RequestMethod.GET)
    public String home(Model model) {

        Date date = new Date();
        DateFormat dateformat = DateFormat.getDateTimeInstance(DateFormat.LONG, DateFormat.LONG);

        String formatteddate = dateformat.format(date);

        model.addAttribute("servertime", formatteddate );

        return "home";
    }

    /**
     * Handles POST /doit when the request carries an "action1" parameter.
     */
    @RequestMapping(value = "/doit", method = RequestMethod.POST, params="action1")
    public String doit1(@RequestParam("value1") int value1,
            @RequestParam("action1") String action, Model model) {
        logger.info("value1={}",value1);

        Date date = new Date();
        DateFormat dateformat = DateFormat.getDateTimeInstance(DateFormat.LONG, DateFormat.LONG);

        String formatteddate = dateformat.format(date);

        model.addAttribute("servertime", formatteddate );

        return "home";
    }

    /**
     * Handles POST /doit when the request carries an "action2" parameter.
     */
    @RequestMapping(value = "/doit", method = RequestMethod.POST, params="action2")
    public String doit2(@RequestParam("value1") int value1,
            @RequestParam("action2") String action, Model model) {
        logger.info("value1={}",value1);

        Date date = new Date();
        DateFormat dateformat = DateFormat.getDateTimeInstance(DateFormat.LONG, DateFormat.LONG);

        String formatteddate = dateformat.format(date);

        model.addAttribute("servertime", formatteddate );

        return "home";
    }
}

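The form on the page posts a value1 field to /doit and has two submit buttons, action1 and action2.

My intent was that clicking the action1 button calls initbinder1, and clicking the action2 button calls initbinder2.

And that is indeed what happens.

Clicking action1 runs logger.info("initbinder1");

Clicking action2 runs logger.info("initbinder2");

So it seems that the goal of setting a validator separately for each action method can be achieved.

But that is not really the case.

I debugged into the source code behind @InitBinder, InitBinderDataBinderFactory, and found that the method above is only called when the action parameter is bound. When value1 is bound, neither method is called. Yet what we usually need is to apply different validators to the same parameter for different actions.

So for now there is still no good way to solve this directly!

Perhaps we could implement our own DataBinderFactory to solve it.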

黑灵 2013-04-17 00:42


How is the next function in the KMP algorithm derived?

黑灵 — sun, 14 apr 2013 15:24:00 gmt

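That is, the relation $P(1)\,P(2)\cdots P(k-1) = P(j-k+1)\,P(j-k+2)\cdots P(j-1)$. Why is it enough, once this k is found, to compare the k-th character of the pattern with the i-th character of the string?

The closest thing to an understandable explanation I have found so far is the one on Baidu Baike:

Suppose
Main string S: $S(1)\,S(2)\,S(3)\cdots S(n)$;
Pattern P: $P(1)\,P(2)\,P(3)\cdots P(m)$
After reading this part of the textbook, continue.
Now suppose that the i-th character of the main string and the j-th character of the pattern (j <= m) mismatch, and that the i-th character of the main string is then compared with the k-th character of the pattern (k < j). At this point $S(i) \ne P(j)$, and we have
Main string:  S(1) …… S(i-j+1) …… S(i-1)   S(i) ……
                        ||    (match)    ||      ≠ (mismatch)
Pattern:               P(1)   ………   P(j-1)   P(j)
From this we get the relation:
$P(1)\,P(2)\,P(3)\cdots P(j-1) = S(i-j+1)\cdots S(i-1)$
Since $S(i) \ne P(j)$, S(i) will next be compared with P(k). The substring made of the first k-1 characters of the pattern must then satisfy the following relation, and there must be no k' > k that also satisfies it (k < j):
$P(1)\,P(2)\,P(3)\cdots P(k-1) = S(i-k+1)\,S(i-k+2)\cdots S(i-1)$
That is:
Main string:  S(1) …… S(i-k+1)  S(i-k+2) …… S(i-1)   S(i) ……
                        ||       (match)       ||      || (still to be compared)
Pattern:               P(1)     P(2)   ……   P(k-1)   P(k)
Now we combine the relations summarized above:
S(1) … S(i-j+1) … S(i-k+1)  S(i-k+2) …… S(i-1)   S(i) ……
         ||   (match)   ||        ||          ||      ≠ (mismatch)
        P(1)  ……  P(j-k+1)  P(j-k+2) …… P(j-1)   P(j)
                        ||   (match)          ||      || (still to be compared)
                       P(1)     P(2)   ……   P(k-1)   P(k)
From the above we get the relation:
$P(1)\,P(2)\cdots P(k-1) = P(j-k+1)\,P(j-k+2)\cdots P(j-1)$

I don't know which textbook this was copied from. Actually, I still don't quite understand the last step.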
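One way to read the last step: the first relation says that the k-1 characters of the text just before S(i) equal the last k-1 characters of P(1)…P(j-1), and the second says that those same text characters must equal P(1)…P(k-1). Both describe the same stretch of text, so the text drops out and what remains is a condition on the pattern alone. That is why next can be computed from the pattern before any matching starts. The following is a minimal sketch using the common textbook convention (1-based positions, next[1] = 0); KmpNext and its method names are hypothetical.

```java
class KmpNext {
    // next[j] = k: after a mismatch at pattern position j (1-based),
    // S(i) is compared with P(k) next. next[1] = 0 means "advance i".
    static int[] next(String p) {
        int m = p.length();
        int[] next = new int[m + 1]; // index 0 is unused
        int j = 1, k = 0;
        while (j < m) {
            if (k == 0 || p.charAt(j - 1) == p.charAt(k - 1)) {
                j++;
                k++;
                next[j] = k; // P(1)..P(k-1) equals P(j-k+1)..P(j-1)
            } else {
                k = next[k]; // fall back to a shorter matching prefix
            }
        }
        return next;
    }

    // Returns the 0-based index of the first occurrence of p in s, or -1.
    static int indexOf(String s, String p) {
        if (p.isEmpty()) {
            return 0;
        }
        int[] next = next(p);
        int i = 1, j = 1;
        while (i <= s.length() && j <= p.length()) {
            if (j == 0 || s.charAt(i - 1) == p.charAt(j - 1)) {
                i++;
                j++;
            } else {
                j = next[j]; // i never moves backwards
            }
        }
        return j > p.length() ? i - p.length() - 1 : -1;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(next("abaabcac")));
        System.out.println(indexOf("acabaabaabcacaabc", "abaabcac"));
    }
}
```

For the pattern "abaabcac" the values next[1] to next[8] are 0, 1, 1, 2, 2, 3, 1, 2, and the search above finds the pattern at 0-based index 5.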

黑灵 2013-04-14 23:24


A simple Linux command to monitor the number of connections on a port

黑灵 — tue, 30 oct 2012 01:38:00 gmt

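#!/bin/sh
# usage: count_conn PORT
netstat -anp | grep ":$1" | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c

Save this as an executable script named count_conn (somewhere on your PATH), then run:

watch -n 1 count_conn 8080

The last argument is the port you want to monitor.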
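For readers who want the same count inside a Java program, here is a minimal sketch that mirrors the pipeline stage by stage on lines of netstat output. ConnCounter and countByRemoteHost are hypothetical names, and the sample lines in main are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ConnCounter {
    // Counts connections per remote host among netstat lines that mention ":<port>".
    static Map<String, Integer> countByRemoteHost(Iterable<String> netstatLines, int port) {
        Map<String, Integer> counts = new TreeMap<>();       // sorted by host, like sort
        for (String line : netstatLines) {
            if (!line.contains(":" + port)) {
                continue;                                    // grep ":$1"
            }
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 5) {
                continue;                                    // no fifth column to read
            }
            String remote = cols[4];                         // awk '{print $5}'
            String host = remote.split(":")[0];              // awk -F: '{print $1}'
            counts.merge(host, 1, Integer::sum);             // uniq -c
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the usual netstat -anp layout.
        List<String> sample = Arrays.asList(
            "tcp   0   0 10.0.0.1:8080   10.0.0.2:51234   ESTABLISHED 1234/java",
            "tcp   0   0 10.0.0.1:8080   10.0.0.2:51235   ESTABLISHED 1234/java",
            "tcp   0   0 10.0.0.1:8080   10.0.0.3:40000   TIME_WAIT   -",
            "tcp   0   0 10.0.0.1:22     10.0.0.9:50000   ESTABLISHED 99/sshd");
        System.out.println(countByRemoteHost(sample, 8080));
    }
}
```

Like the shell version, the filter is a plain substring match, so it also keeps lines where the port number appears on the remote side or as the prefix of a longer port number.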

黑灵 2012-10-30 09:38
