orangearg.argument.miner.processor
Argument processor module.
Module Contents
Functions
_match_list_size() | Check whether an arbitrary number of input lists all have the same size.
_aggregate_list_by_another() | Aggregate a list according to elements of another list.
get_argument_topics() | Get argument topics.
get_argument_sentiment() | Get argument sentiment score.
get_argument_coherence() | Get argument coherence.
update_argument_table() | Return a copy of the argument dataframe, with new columns for argument topics, sentiments, and coherences.
- orangearg.argument.miner.processor._match_list_size(*args: List)
Check whether an arbitrary number of input lists all have the same size.
- orangearg.argument.miner.processor._aggregate_list_by_another(keys: List, values: List) → Dict
Aggregate a list according to elements of another list.
- Parameters:
keys (List) – The group keys.
values (List) – The list to be aggregated.
- Returns:
The aggregation result.
- Return type:
Dict
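The grouping described above can be sketched as follows. This is a minimal illustration, not the module's implementation: the name aggregate_by_keys is hypothetical, and it assumes that values sharing a key are collected into a list in their original order, which the description does not state explicitly.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def aggregate_by_keys(keys: List[Hashable], values: List) -> Dict[Hashable, list]:
    """Group values by the key found at the same position (hypothetical helper)."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        # Assumption: grouped values are kept as a list, in input order.
        groups[key].append(value)
    return dict(groups)


result = aggregate_by_keys([0, 0, 1], ["a", "b", "c"])
print(result)  # {0: ['a', 'b'], 1: ['c']}
```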
- orangearg.argument.miner.processor.get_argument_topics(arg_ids: List[int], topics: List[int]) → List[Tuple[int]]
Get argument topics.
The topics of an argument are the combined topics of all chunks that belong to it. Duplicates are not removed, because they can be treated as a sign of topic importance. Also, even when two chunks belong to the same topic, they can still have different ranks within an argument.
- Parameters:
arg_ids (List[int]) – the argument ids of chunks.
topics (List[int]) – the topic indices of chunks.
- Returns:
A list of argument topics, where each entry is itself a list containing the topic indices of the chunks belonging to that argument.
- Return type:
List[list[int]]
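To illustrate how duplicates are kept, here is a small sketch of collecting chunk topics per argument. The function name sketch_argument_topics is hypothetical, and the ordering of the result (arguments in order of first appearance) is an assumption rather than documented behaviour.

```python
from typing import Dict, List


def sketch_argument_topics(arg_ids: List[int], topics: List[int]) -> List[List[int]]:
    """Collect the topic index of every chunk under its argument id (illustrative only)."""
    grouped: Dict[int, List[int]] = {}
    for arg_id, topic in zip(arg_ids, topics):
        # Duplicates are appended on purpose: repeated topics signal importance.
        grouped.setdefault(arg_id, []).append(topic)
    # Assumption: arguments are returned in order of first appearance.
    return list(grouped.values())


# Argument 0 has three chunks, two of which share topic 2.
arg_topics = sketch_argument_topics([0, 0, 0, 1, 1], [2, 2, 5, 1, 2])
print(arg_topics)  # [[2, 2, 5], [1, 2]]
```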
- orangearg.argument.miner.processor.get_argument_sentiment(arg_ids: List[int], ranks: List[float], p_scores: List[float], min_sent: int = -1, max_sent: int = 1) → List[float]
Get argument sentiment score.
The sentiment score of an argument is the weighted sum of the sentiment scores of the chunks belonging to it, where the weights are the chunk ranks. The resulting score is then normalized into the range [0, 1].
- Parameters:
arg_ids (List[int]) – the argument ids of chunks.
ranks (List[float]) – the PageRank scores of chunks within arguments.
p_scores (List[float]) – the sentiment polarity scores of chunks.
min_sent (int) – minimum of argument sentiment before normalization. Defaults to -1.
max_sent (int) – maximum of argument sentiment before normalization. Defaults to 1.
- Returns:
List of argument sentiment scores, which are floats in range [0, 1].
- Return type:
List[float]
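The weighted sum and normalization can be sketched as below. This is an illustration under stated assumptions, not the module's code: the name sketch_argument_sentiment is hypothetical, the normalization is assumed to be min-max scaling with min_sent and max_sent, and the ranks of the chunks in one argument are assumed to sum to 1 so that the weighted sum stays within [min_sent, max_sent].

```python
from typing import Dict, List


def sketch_argument_sentiment(
    arg_ids: List[int],
    ranks: List[float],
    p_scores: List[float],
    min_sent: int = -1,
    max_sent: int = 1,
) -> List[float]:
    """Rank-weighted sum of chunk polarities per argument, scaled to [0, 1]."""
    weighted: Dict[int, float] = {}
    for arg_id, rank, score in zip(arg_ids, ranks, p_scores):
        weighted[arg_id] = weighted.get(arg_id, 0.0) + rank * score
    span = max_sent - min_sent
    # Assumption: min-max scaling maps [min_sent, max_sent] onto [0, 1].
    return [(total - min_sent) / span for total in weighted.values()]


# Argument 0: two equally ranked chunks with opposite polarity cancel out.
# Argument 1: a single mildly positive chunk.
sentiments = sketch_argument_sentiment(
    arg_ids=[0, 0, 1],
    ranks=[0.5, 0.5, 1.0],
    p_scores=[1.0, -1.0, 0.5],
)
print(sentiments)  # [0.5, 0.75]
```

A neutral argument lands at 0.5 rather than 0, because the polarity range [-1, 1] is shifted and scaled onto [0, 1].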
- orangearg.argument.miner.processor.get_argument_coherence(scores: List[int], sentiments: List[float], min_score: int = 1, max_score: int = 5, variance: float = 0.2) → List[float]
Get argument coherence.
Coherence is computed as the inverted difference between argument sentiments and overall scores. Overall scores are first normalized into the same range as argument sentiments, which is [0, 1]. The differences are then computed, and a Gaussian kernel is applied to invert and scale them to (0, 1].
- Parameters:
scores (List[int]) – List of argument overall scores.
sentiments (List[float]) – List of argument sentiment scores.
min_score (int, optional) – Lower bound of scores. Defaults to 1.
max_score (int, optional) – Upper bound of scores. Defaults to 5.
variance (float) – variance of the Gaussian kernel. Defaults to 0.2.
- Returns:
List of argument coherence scores, in the range (0, 1].
- Return type:
List[float]
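The steps above can be sketched as follows. The exact kernel used by the module is not given here, so this sketch assumes the common form exp(-d² / (2 · variance)), where d is the difference between the sentiment and the normalized overall score; the name sketch_argument_coherence is hypothetical.

```python
import math
from typing import List


def sketch_argument_coherence(
    scores: List[int],
    sentiments: List[float],
    min_score: int = 1,
    max_score: int = 5,
    variance: float = 0.2,
) -> List[float]:
    """Map the sentiment/score disagreement to (0, 1] with a Gaussian kernel."""
    span = max_score - min_score
    coherences = []
    for score, sentiment in zip(scores, sentiments):
        normalized = (score - min_score) / span  # overall score scaled to [0, 1]
        diff = sentiment - normalized
        # Assumed kernel form: zero difference gives 1.0, larger gaps decay towards 0.
        coherences.append(math.exp(-(diff ** 2) / (2 * variance)))
    return coherences


# First argument: top score and fully positive sentiment agree perfectly.
# Second argument: lowest score but fully positive sentiment, so coherence is low.
coherences = sketch_argument_coherence(scores=[5, 1], sentiments=[1.0, 1.0])
print(coherences[0])  # 1.0
print(coherences[1] < coherences[0])  # True
```

Because the exponential never reaches zero, the result lies in (0, 1], which matches the documented return range.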
- orangearg.argument.miner.processor.update_argument_table(df_arguments: pandas.DataFrame, topics: List[List[int]], sentiments: List[float], coherences: List[float]) → pandas.DataFrame
Return a copy of the argument dataframe, with new columns for argument topics, sentiments, and coherences.
- Parameters:
df_arguments (pd.DataFrame) – argument dataframe.
topics (List[List[int]]) – list of argument topics
sentiments (List[float]) – list of argument sentiment scores
coherences (List[float]) – list of argument coherence scores
- Returns:
A copy of the argument dataframe with new columns for argument topics, sentiments, and coherences.
- Return type:
pd.DataFrame