[1] Automatically choosing a location provider based on the device

Source: Internet. Published: 2014-02-18

// Obtain the system location service (used to get latitude and longitude).
locationManager = (LocationManager) AreaOneActivity.this.getSystemService(Context.LOCATION_SERVICE);

// Pick the most suitable location provider for this device.
String provider = AreaOneActivity.this.getBestProvider();

if (provider != null && provider.length() > 0) {
    // Request updates at most every 5000 ms, with no minimum distance.
    locationManager.requestLocationUpdates(provider, 5000, 0, new AreaLocationListener());
} else {
    Toast.makeText(getApplicationContext(), "Unable to determine location", Toast.LENGTH_SHORT).show();
}

/**
 * Returns the most suitable location provider.
 * @return the provider name, or null if no enabled provider matches
 */
private String getBestProvider() {
    Criteria criteria = new Criteria();
    criteria.setAccuracy(Criteria.ACCURACY_COARSE);
    criteria.setAltitudeRequired(false);
    criteria.setBearingRequired(false);
    criteria.setSpeedRequired(false);
    // true: only consider providers that are currently enabled.
    return locationManager.getBestProvider(criteria, true);
}

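[2] Discussion: searching a dictionary of several hundred thousand words. Opinions and suggestions welcome

Source: Internet. Published: 2014-02-18

The YourDict dictionary app is mostly finished. English lookups take less than 1 s, which is acceptable, but the Chinese dictionary is a problem: with more than 600,000 entries, lookup time jumps to over 10 seconds. While looking for a fix I realized this is a classic computer science problem, except that this time it is on Android, where memory is extremely scarce and approaches such as building a tree need great care. I hope someone can suggest a practical solution.

Below is the question I posted to a Google Group. So far there are basically two candidates: a database and a trie. At first I liked Kris's database suggestion, but Christopher's point about poor real-time performance with a database also worries me, and trying both would be too much work. Does anyone have a better approach? If you have actually used a trie, please tell me how it performed and what to watch out for when implementing it.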

felix
Dec 20, 1:26 PM
Hi!
I'm working on a dict app on android,
I need to search a list of words(about 500-600 thousand words) in file
to find the word.
It took me about 10-20 seconds to search the word. How can I improve
the search speed?
Thanks to all!

Kristopher Micinski
Dec 20, 1:49 PM
Ah, what a classic question in computer science :-)
To really get the answer to this question, you're going to have to
learn a little bit about data structures. Wait... How is it taking
you *20 seconds* to find the word!? That's absurd! Really? You're
doing string comparisons over 500 strings and it's taking you 20
seconds!?

Anyway, there are two solutions, you might just try using a database,
(not a bad idea, actually), or you might use a hash table (lookup
"HashTable"), if you want to check for bogus words before searching
(okay so this is a bit of a stretch and probably not useful but I
think it deserves a mention) you can look at using a bloom filter...
Obviously there are tons of other data structures you can use too.

Kris

P.s., (did I mention that you should probably be using a database, as,
for Android, it's probably going to be the best acceptable solution that is
fairly extensible. I'm sure somebody might bring up the possible
badness of having it out on the SD card somewhere, but even this isn't
so bad, especially compared to 20 seconds!)

Kristopher Micinski
Dec 20, 1:49 PM
OH! Very sorry! I didn't see the 500, thousand!!!
Kris

Jim Graham
Dec 20, 2:27 PM

On Mon, Dec 19, 2011 at 09:26:11PM -0800, felix wrote:
> Hi!
> I'm working on a dict app on android,
> I need to search a list of words(about 500-600 thousand words) in file
> to find the word.
> It took me about 10-20 seconds to search the word. How can I improve
> the search speed?

Well, along with Kris's solutions, here's another (that you could use
with his, or on its own if it's enough):
Use whatever works best for you (regexp or simply grabbing the first
char directly from the string) and get the first character (or first
two, or ... and so on) and split your data accordingly. That way,
instead of searching through the WHOLE LIST for zulu, you'd only search
words starting with 'z' (or "zu", etc.). It would no doubt work better
combined with Kris's ideas.

Later,
   --jim

--
THE SCORE: ME: 2 CANCER: 0
73 DE N5IAL (/4)        MiSTie #49997 < Running FreeBSD 7.0 >
spooky1...@gmail.com ICBM/Hurricane: 30.44406N 86.59909W

      "'Wrong' is one of those concepts that depends on witnesses."
     --Catbert: Evil Director of Human Resources (Dilbert, 05Nov09)

Android Apps Listing at http://www.jstrack.org/barcodes.html
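
Jim's suggestion of splitting the word list by first character can be sketched in Java roughly as follows. This is only an illustration and not code from the thread: the class name `FirstLetterBuckets` and its methods are invented for the example, and it assumes the whole word list fits in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: groups words by their first character. */
final class FirstLetterBuckets {
    private final Map<Character, List<String>> buckets = new HashMap<>();

    /** Adds a word to the bucket for its first character; empty words are ignored. */
    void add(String word) {
        if (word.isEmpty()) {
            return;
        }
        buckets.computeIfAbsent(word.charAt(0), c -> new ArrayList<>()).add(word);
    }

    /** Scans only the bucket that matches the first character of the word. */
    boolean contains(String word) {
        if (word.isEmpty()) {
            return false;
        }
        List<String> bucket = buckets.get(word.charAt(0));
        return bucket != null && bucket.contains(word);
    }

    /** Number of words that start with the given character. */
    int bucketSize(char first) {
        List<String> bucket = buckets.get(first);
        return bucket == null ? 0 : bucket.size();
    }
}
```

Each lookup is still a linear scan, only over a smaller list, which is why Jim expects it to work better in combination with the other ideas.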

Kristopher Micinski
Dec 20, 2:30 PM

Jim, this is a good solution, but I would argue that if he does indeed
"read about data structures" (which I nebulously proposed he do), he
might stumble upon a trie:
http://en.wikipedia.org/wiki/Trie

Which is basically what you propose.

(I'm not trying to be condescending here, I'm really trying to point
the OP to another data structure he could consider.)

kris

Jim Graham
Dec 20, 2:35 PM

On Tue, Dec 20, 2011 at 01:30:10AM -0500, Kristopher Micinski wrote:
> On Tue, Dec 20, 2011 at 1:27 AM, Jim Graham <spooky1...@gmail.com> wrote:
> > On Mon, Dec 19, 2011 at 09:26:11PM -0800, felix wrote:
> http://en.wikipedia.org/wiki/Trie

Wow...I never knew that had a name. :-)
Later,
   --jim

Kristopher Micinski
Dec 20, 2:36 PM

On Tue, Dec 20, 2011 at 1:35 AM, Jim Graham <spooky1...@gmail.com> wrote:
> On Tue, Dec 20, 2011 at 01:30:10AM -0500, Kristopher Micinski wrote:
>> On Tue, Dec 20, 2011 at 1:27 AM, Jim Graham <spooky1...@gmail.com> wrote:
>> > On Mon, Dec 19, 2011 at 09:26:11PM -0800, felix wrote:
>> http://en.wikipedia.org/wiki/Trie

> Wow...I never knew that had a name. :-)

> Later,
>   --jim

"A common application of a trie is storing a dictionary, such as one
found on a mobile telephone. "
:-)

Kris

P.s., (promise I didn't write that.)

felix
Dec 20, 2:46 PM
Thanks a lot! I think I'll give database a try!:)

felix
Dec 20, 2:47 PM
I've considered trie. But it consumes a lot of memory to construct...

Kristopher Micinski
Dec 20, 2:51 PM
But you only have to construct it once.
Many data structures with good lookup perf will take time to set up

Kris

P.s., However, databases are highly evolved, and do all of this very
efficiently, so the whole argument is somewhat silly, as if you just
use one you'll be fine.

Christopher Van Kirk
Dec 20, 2:54 PM
A conventional database isn't going to do better than a Trie, I think.

Kristopher Micinski
Dec 20, 3:08 PM
Right,
But it does have the advantage that the technology on Android is
already there, so he doesn't have to write the implementation himself,
or grab one and learn to use it off the web.

kris

Christopher Van Kirk
Dec 20, 3:11 PM
Then again a Trie isn't really that hard to write.

Kristopher Micinski
Dec 20, 3:16 PM
Right,
but getting the huge thing in the right format, storing that
statically, etc.., vs preloading the app with a database, which sounds
easier? I just think the database sounds like the better way to go on
this one, and I'm biased to not reinventing the wheel, but the OP is
obviously free to use whatever..

kris

martypantsROK
Dec 20, 5:10 PM
Don't forget there are more than data structures involved here.
The method searching could be improved. As Jim suggested, breaking
things down with an index (search for zulu beginning in the z section)
could be sped up even more. Search for the last letter in the string
first. By searching for that 4th character "u" first you've eliminated
3 other characters and can skip on to the next word. That way, similar
words like zuch or zucchini won't slow you down matching the first two
characters. Works even better for longer words.
Marty

Solution 9420
Dec 20, 5:40 PM
Hi,
I'm the author of 9420 Thai Keyboard which incorporates English word
suggestion feature as well.
I've around 200K words and can look up in an average of 80 ms.
Assuming the word DB is static, I've done the following...
1. Pre-sort your words in a file.
2. Pre-index your words.
3. Use a binary search tree algorithm.

You'll have to be a bit careful about the size of the index file, and very
optimized on memory usage to avoid the delay from Java garbage
collection as well.

Cheers,
Solution 9420...

www.solution9420.com
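
The pre-sort and binary search idea from this post could look roughly like the Java sketch below. It is not the poster's implementation: it uses a plain binary search over a sorted in-memory array instead of an index file, and the class name `SortedWordIndex` and its methods are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch: a static word list, sorted once and searched by binary search. */
final class SortedWordIndex {
    private final String[] words;

    /** Copies and sorts the input; for a static dictionary this could be done offline. */
    SortedWordIndex(String[] unsorted) {
        words = unsorted.clone();
        Arrays.sort(words);
    }

    /** Exact lookup in O(log n) string comparisons. */
    boolean contains(String word) {
        return Arrays.binarySearch(words, word) >= 0;
    }

    /** Index of the first word that is not smaller than key. */
    int lowerBound(String key) {
        int lo = 0;
        int hi = words.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (words[mid].compareTo(key) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** First word in sorted order that starts with prefix, or null if there is none. */
    String firstWithPrefix(String prefix) {
        int i = lowerBound(prefix);
        return (i < words.length && words[i].startsWith(prefix)) ? words[i] : null;
    }
}
```

A single sorted array holds no per-word node objects, which fits the post's warning about keeping memory use and garbage collection pauses low.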

Kristopher Micinski
Dec 20, 5:43 PM

On Tue, Dec 20, 2011 at 4:10 AM, martypantsROK <martyg...@gmail.com> wrote:
> Don't forget there are more than data structures involved here.
> The method searching could be improved. As Jim suggested, breaking
> things down with an index (search for zulu beginning in the z section)
> could be sped up even more. Search for the last letter in the string
> first. By searching for that 4th character "u" first you've eliminated
> 3 other characters and can skip on to the next word. That way, similar
> words like zuch or zucchini won't slow you down matching the first two
> characters. Works even better for longer words.
> Marty

I guess my point in all of this is that this searching is highly tied
to your data structure. Good algorithms only work with good data
structures to back them. And there are many indexing and optimization
techniques you can use to get more efficiency. My point is, that
since you can argue all day over these things getting more and more
complicated data structures and searching algorithms (each becoming
more and more context dependent), most of the time for this
application using a database will suffice. If you use a database,
whose indexing method is already going to be pretty good, and find it
doesn't suit your needs, *then* you can switch over to using something
fancier, though I highly doubt you'd need anything much fancier than a
trie in this case.
SQLite is using B+ trees for tables, while this isn't *amazing*
(especially compared to what you'll see with a trie), it's still going
to be massively better (where massively = logarithmic), than just
linear search. Along with this, it looks like "Solution 9420" shared
his advice... And don't forget about the bloom filter, (this won't
actually help you that much unless you're doing a bunch of queries in
a row, most of which might not be in the database, but I wanted to
bring it up again anyway..)

kris
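
For readers who have not met the bloom filter Kris mentions, here is a minimal Java sketch of one used as a cheap check before the real lookup. Everything in it is an assumption made for illustration: the class name `BloomFilterSketch`, the bit-array size and hash count chosen by the caller, and the choice of `String.hashCode()` combined with an FNV-1a hash to derive the bit positions.

```java
import java.util.BitSet;

/** Hypothetical sketch of a bloom filter over strings. */
final class BloomFilterSketch {
    private final BitSet bits;
    private final int size;    // number of bits in the filter
    private final int hashes;  // number of bit positions set per word

    BloomFilterSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    /** Sets the bit positions derived from the word. */
    void add(String word) {
        int h1 = word.hashCode();
        int h2 = fnv1a(word);
        for (int i = 0; i < hashes; i++) {
            bits.set(Math.floorMod(h1 + i * h2, size));
        }
    }

    /** false means the word was never added; true means it may have been added. */
    boolean mightContain(String word) {
        int h1 = word.hashCode();
        int h2 = fnv1a(word);
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, size))) {
                return false;
            }
        }
        return true;
    }

    /** 32-bit FNV-1a over the UTF-16 code units, used as the second hash. */
    private static int fnv1a(String s) {
        int h = 0x811c9dc5;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 16777619;
        }
        return h;
    }
}
```

A `false` answer lets the app skip the expensive search entirely, while a `true` answer still has to be confirmed against the real dictionary because false positives are possible.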

Christopher Van Kirk
Dec 20, 5:59 PM
Three points.
1) Building the searching functionality twice is far more expensive than
building it once, no matter what approach you use. Be sure that the
performance of the DB approach is acceptable before you go and build it
that way.

2) It can be quite challenging to get decent performance out of a
database for something like this, depending on the functionality
required. If, for example, you need real-time narrowing down of words, a
database is going to be very slow (e.g. as you type letters, you get an
alphabetized list of what's in the db).

3) There's probably an open source Trie out there somewhere that you can
just use.

Directed at the OP, of course.

Cheers...

Kristopher Micinski
Dec 20, 6:04 PM
On Tue, Dec 20, 2011 at 4:59 AM, Christopher Van Kirk <christopher.vank...@gmail.com> wrote:
> Three points.
> 1) Building the searching functionality twice is far more expensive than
> building it once, no matter what approach you use. Be sure that the
> performance of the DB approach is acceptable before you go and build it that
> way.

Okay.

> 2) It can be quite challenging to get decent performance out of a database
> for something like this, depending on the functionality required. If, for
> example, you need real-time narrowing down of words, a database is going to
> be very slow (e.g. as you type letters, you get an alphabetized list of
> what's in the db).

True..

> 3) There's probably an open source Trie out there somewhere that you can
> just use.

Right, which is what I suggested in the first place if he goes this direction..
http://wikipedia-clustering.speedblue.org/trieJava.php

kris
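
Since the thread keeps returning to the trie, a small Java sketch may help readers weigh it against the database option. It is not taken from the linked implementation: the class name `TrieSketch` and its methods are hypothetical, and it uses one `TreeMap` per node for readability, which is far too memory-hungry for 600,000 words on a phone without a more compact node layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a trie with exact lookup and prefix listing. */
final class TrieSketch {
    private static final class Node {
        // TreeMap keeps children ordered by character, so prefix results come out sorted.
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    /** Adds a word, creating one node per character as needed. */
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    /** Exact lookup: cost depends on the word length, not on the dictionary size. */
    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    /** Returns up to limit words that start with prefix, in character order. */
    List<String> withPrefix(String prefix, int limit) {
        List<String> out = new ArrayList<>();
        Node node = find(prefix);
        if (node != null) {
            collect(node, new StringBuilder(prefix), out, limit);
        }
        return out;
    }

    /** Walks down the trie along s; returns null if the path does not exist. */
    private Node find(String s) {
        Node node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            node = node.children.get(s.charAt(i));
        }
        return node;
    }

    /** Depth-first walk that appends complete words until the limit is reached. */
    private void collect(Node node, StringBuilder path, List<String> out, int limit) {
        if (out.size() >= limit) {
            return;
        }
        if (node.isWord) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());
            collect(e.getValue(), path, out, limit);
            path.setLength(path.length() - 1);
        }
    }
}
```

The `withPrefix` method corresponds to the real-time narrowing case Christopher describes, where typing each letter should produce an alphabetized list of candidates. Because Java strings are sequences of `char`, the same structure also works for Chinese headwords.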

[3] java.io.IOException: Too many open files caused by improper use of Runtime.getRuntime().exec(cmd)

Source: Internet. Published: 2014-02-18

Reposted from: http://www.blogjava.net/jnbzwm/archive/2010/09/14/332009.html

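Today an unwelcome entry showed up in the log of a Java application in our production environment:
java.io.IOException: Too many open files

Most of what I found online about this exception says to raise the Linux limit on open file handles.
But if the program fails to close Sockets, Streams and the like promptly after use, raising the limit only treats the symptom, not the cause.

I first raised the file handle limit in the test environment and then increased the test load. After a while the exception came back.
(In the meantime I watched the number of open file handles with lsof, and it kept climbing.)
Running lsof -p *** showed entries such as
java 22055 webapp 21w FIFO 0,6 29300342 pipe
java 22055 webapp 22r FIFO 0,6 29256305 pipe
and their number kept growing.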

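So I went through the code. The file I/O and the database access all looked fine,
and the search finally led to the code where the Java program calls a shell script.

The code is simple and looks clean, but it has an obvious problem: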

Process proc = Runtime.getRuntime().exec(cmd);
// Handling of proc.getErrorStream() and proc.getInputStream() omitted.
proc.waitFor();
return proc.exitValue();

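The first problem is that the streams are not closed in a finally block. That one is fairly obvious.
The second problem is how Process is used.

If you are not familiar with Process, you might assume that everything is done once return proc.exitValue(); has run.
(The name exitValue() does suggest that the process has exited and you have its return value, which is probably what misled our developers.)
That is not the case. According to the JDK documentation, you need to call destroy() to destroy the child process and release the file descriptors it holds.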
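
A corrected version of the call might look like the sketch below. It is one possible shape and not the original author's fix: the class name `ExecSketch` and the helpers `drain` and `closeQuietly` are hypothetical, and the sketch simply discards the output of the child process.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch: run a command and always release the pipes of the child process. */
final class ExecSketch {

    /** Runs cmd, waits for it to finish and returns its exit code. */
    static int run(String[] cmd) throws IOException, InterruptedException {
        Process proc = Runtime.getRuntime().exec(cmd);
        try {
            // Drain stderr on a helper thread so the child cannot block on a full pipe
            // while this thread is reading stdout.
            final InputStream err = proc.getErrorStream();
            Thread errDrainer = new Thread(() -> drain(err));
            errDrainer.start();
            drain(proc.getInputStream());
            int exitCode = proc.waitFor();
            errDrainer.join();
            return exitCode;
        } finally {
            // Close all three pipes and destroy the process, even if an exception was thrown.
            closeQuietly(proc.getInputStream());
            closeQuietly(proc.getErrorStream());
            closeQuietly(proc.getOutputStream());
            proc.destroy();
        }
    }

    /** Reads and discards everything until end of stream. */
    private static void drain(InputStream in) {
        byte[] buf = new byte[4096];
        try {
            while (in.read(buf) != -1) {
                // Output is discarded in this sketch.
            }
        } catch (IOException ignored) {
            // The stream was closed or the child went away; nothing left to read.
        }
    }

    /** Closes a stream and ignores any failure while closing. */
    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close fails.
        }
    }
}
```

With the cleanup in the finally block, the pipe entries reported by lsof should no longer pile up with every call.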

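Short test runs will not reveal this problem, but once the program is in production and runs for a long time, the oversight surfaces.
So when you are unsure how a method behaves, investigate first and use it with care.

I hope this helps anyone tracking down a similar problem: if the code you are reviewing also calls Runtime.getRuntime().exec(cmd), make sure that code handles the process correctly.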