Regex to Match a Specific Word in a String

Text preprocessing is one of the most important tasks in Natural Language Processing (NLP). For instance, you may want to remove all punctuation marks from text documents before they can be used for text classification. Similarly, you may want to extract numbers from a text string. Writing manual scripts for such preprocessing tasks requires a lot of effort and is prone to errors.

Given the importance of these preprocessing tasks, regular expressions (regex) are available in many languages to make them easier. A regular expression is a text string that describes a search pattern, which can be used to match or replace patterns inside a string with a minimal amount of code.

In this tutorial, we will learn to match a specific word in a string using Java regex. We will match "java" in "java is object oriented language". But it should not match "javap" in "javap is another tool in JDK bundle".

Word Boundary Matchers

Boundary matchers help to find a particular word, but only if it appears at the beginning or end of a line. Rather than matching characters, they match at certain positions, effectively anchoring the regular expression match at those positions. For example:

\Z    The end of the input but for the final terminator, if any

The regular expression token "\b" is called a word boundary. It matches at the start or the end of a word. By itself, it results in a zero-length match. Strictly speaking, "\b" matches in these three positions:

1. Before the first character in the data, if the first character is a word character.
2. After the last character in the data, if the last character is a word character.
3. Between two characters in the data, where one is a word character and the other is not a word character.

To run a "specific word only" search using a regular expression, simply place the word between two word boundaries.

    String data1 = "Today, java is object oriented language";
    String regex = "\\bjava\\b";
    Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(data1);

Output: Start index: 7 End index: 11 java

Please note that matching the above regex with "Also, javap is another tool in JDK bundle" doesn't produce any result.

Suppose you want to match "java" such that it also matches inside words like "javap", "myjava" or "myjavaprogram", i.e. the word "java" can lie anywhere in the data string. You want to match "java" in all four places in the string "Searching in words : java javap myjava myjavaprogram". To be able to do so, simply don't use any boundary at all.

Solution regex: word

    String data1 = "Searching in words : java javap myjava myjavaprogram";

Output (first match): Start index: 21 End index: 25 java
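
To make the difference between a whole-word match and a match anywhere in the string concrete, here is a minimal sketch. The class name `WordMatchSketch` and the helper `findSpans` are hypothetical names introduced only for illustration; the sketch assumes the whole-word pattern is `\bjava\b` and the "anywhere" pattern is the bare word `java`, and it relies only on the standard `java.util.regex` classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordMatchSketch {

    // Hypothetical helper: returns every match as "start-end",
    // where end is exclusive, exactly as reported by Matcher.end().
    static List<String> findSpans(String regex, String data) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(data);
        List<String> spans = new ArrayList<>();
        while (matcher.find()) {
            spans.add(matcher.start() + "-" + matcher.end());
        }
        return spans;
    }

    public static void main(String[] args) {
        String wholeWord = "\\bjava\\b"; // word boundaries on both sides
        String anywhere = "java";        // no boundaries: matches inside longer words

        // Whole word present: prints [7-11]
        System.out.println(findSpans(wholeWord, "Today, java is object oriented language"));

        // Only "javap" present, so the whole-word pattern finds nothing: prints []
        System.out.println(findSpans(wholeWord, "Also, javap is another tool in JDK bundle"));

        // Bare word: one match inside each of the four words
        System.out.println(findSpans(anywhere, "Searching in words : java javap myjava myjavaprogram"));
    }
}
```

Returning the spans as a list, rather than printing inside the loop, keeps the helper easy to reuse with other patterns and input strings.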